MAMMALIAN GENES INVOLVED IN VIRAL INFECTION AND TUMOR SUPPRESSION



BACKGROUND 

Field of the Invention

The present invention provides methods of identifying cellular genes used for
viral growth or for tumor progression. Thus, the present invention relates to nucleic
acids associated with, and methods of reducing or preventing, viral infection, and to
methods of suppressing tumor progression. The invention also relates to methods of
screening for additional such genes.

Background art

Various projects have been directed toward isolating and sequencing the genomes
of various animals, notably the human. However, most methodologies provide
nucleotide sequences to which no function is linked or even suggested, thus limiting the
immediate usefulness of such data.

The present invention, in contrast, provides methods of screening only for
nucleic acids that are involved in a specific process, i.e., viral infection or tumor
progression, and further, for nucleic acids useful in treatments for these processes,
because by this method only nucleic acids which are also nonessential to the cell are
isolated. Such methods are highly useful, since they ascribe a function to each isolated
gene, and thus the isolated nucleic acids can immediately be utilized in various specific
methods and procedures.

For example, the present invention provides methods of isolating nucleic acids
encoding gene products used for viral infection, but nonessential to the cell. Viral
infections of the intestine and liver are significant causes of human morbidity and
mortality. Understanding the molecular mechanisms of such infections will lead to new
approaches in their treatment and control.

Viruses can establish a variety of types of infection. These infections can be
generally classified as lytic or persistent, though some lytic infections are considered
persistent. Generally, persistent infections fall into two categories: (1) chronic
(productive) infection, i.e., infection wherein infectious virus is present and can be
recovered by traditional biological methods, and (2) latent infection, i.e., infection
wherein the viral genome is present in the cell but infectious virus is generally not produced
except during intermittent episodes of reactivation. Persistence generally involves
stages of both productive and latent infection.

Lytic infections can also persist under conditions where only a small fraction of
the total cells are infected (smoldering (cycling) infection). The few infected cells
release virus and are killed, but the progeny virus again infects only a small number of the
total cells. Examples of such smoldering infections include the persistence of lactic
dehydrogenase virus in mice (Mahy, B.W.J., Br. Med. Bull. 41:50-55 (1985)) and
adenovirus infection in humans (Porter, D.D. pp. 784-790 in Baron, S., ed. Medical
Microbiology 2d ed. (Addison-Wesley, Menlo Park, CA 1985)).

Furthermore, a virus may be lytic for some cell types but not for others. For
example, evidence suggests that human immunodeficiency virus (HIV) is more lytic for
T cells than for monocytes/macrophages, and therefore can cause a productive
infection of T cells that results in cell death, whereas HIV-infected mononuclear
phagocytes may produce virus for considerable periods of time without cell lysis
(Klatzmann, et al. Science 225:59-62 (1984); Koyanagi, et al. Science 241:1673-1675
(1988); Sattentau, et al. Cell 52:631-633 (1988)).

Traditional treatments for viral infection include pharmaceuticals aimed at
specific virus-derived proteins, such as HIV protease or reverse transcriptase, or
recombinant (cloned) host-derived immune modulators, such as the interferons.
However, the current methods have several limitations and drawbacks, including
high rates of viral mutation, which render anti-viral pharmaceuticals ineffective. For
immune modulators, limited effectiveness, limiting side effects, and a lack of specificity all
limit the general applicability of these agents. Also, the rate of success with current
antivirals and immune modulators has been disappointing.

The current invention focuses on isolating genes that are not essential for cellular
survival when disrupted in one or both alleles, but which are required for virus
replication. This may occur with a dose effect, in which knock-out of one allele may
confer the phenotype of virus resistance on the cell. As targets for therapeutic
intervention, inhibition of these cellular gene products, including proteins, parts of
proteins (modification enzymes that include, but are not restricted to, glycosylation and lipid
modifiers [myristylate, etc.]), lipids, transcription elements and RNA regulatory
molecules, may be less likely to have profound toxic side effects, and virus mutation is
less likely to overcome the 'block' and allow the virus to replicate successfully.

The present invention provides a significant improvement over previous methods
of attempted therapeutic intervention against viral infection by addressing the cellular
genes required by the virus for growth. Therefore, the present invention also provides
an innovative therapeutic approach to intervention in viral infection by providing
methods to treat viral infections by inhibiting the cellular genes necessary for viral infection.

Because these genes, by virtue of the means by which they are originally detected, are
nonessential to the cell's survival, these treatment methods can be used in a subject
without the serious detrimental effects to the subject that have been found with previous
methods. The present invention also provides the surprising discovery that virally
infected cells are dependent upon a factor in serum to survive. Therefore, the present
invention also provides a method for treating viral infection by inhibiting this serum
survival factor. Finally, these discoveries also provide a novel method for removing
virally infected cells from a cell culture by removing, inhibiting or disrupting this serum
survival factor in the culture so that non-infected cells selectively survive.

The selection of tumor suppressor gene(s) has become an important area in the
discovery of new targets for therapeutic intervention in cancer. Since the discovery that
cells are restricted from promiscuous entry into the cell cycle by specific genes that are
capable of suppressing a "transformed" phenotype, considerable time has been invested
in the discovery of such genes. Some of these genes include the gene associated with
retinoblastoma (Rb) and the gene encoding p53 (apoptosis related). The present
invention provides a method, using gene-trapping, to select cell lines that have a
transformed phenotype from cells that are not transformed and to isolate from these
cells a gene that can suppress a malignant phenotype. Thus, by the nature of the
isolation process, a function is associated with the isolated genes. The capacity to select
tumor suppressor genes quickly can provide unique targets in the process of treating or
preventing, and even for diagnostic testing of, cancer.

DETAILED DESCRIPTION OF THE INVENTION

The present invention utilizes a "gene trap" method along with a selection
process to identify and isolate nucleic acids from genes associated with a particular
function. Specifically, it provides a means of isolating cellular genes necessary for viral
infection but not essential for the cell's survival, and it provides a means of isolating
cellular genes that suppress tumor progression.

The present invention also provides a core discovery that virally infected cells
become dependent upon at least one factor present in serum for survival, whereas
non-infected cells do not exhibit this dependence. This core discovery has been utilized in
the present invention in several ways. First, inhibition of the "serum survival factor" can
be utilized to eradicate persistently virally infected cells from populations of non-infected
cells. Inhibition of this factor can also be used to treat virus infection in a subject, as
further described herein. Additionally, inhibition of or withdrawal of the serum survival
factor in tissue culture allows for the detection of cellular genes required for viral
replication yet nonessential for an uninfected cell to survive. The present invention
further provides several such cellular genes, as well as methods of treating viral
infections by inhibiting the functioning of such genes.

Furthermore, the present invention provides a method for isolation of cellular
genes utilized in tumor progression.

The present method provides several cellular genes that are necessary for viral
growth in the cell but are not essential for the cell to survive. These genes are important
for lytic and persistent infection by viruses. These genes were isolated by generating
gene trap libraries by infecting cells with a retrovirus gene trap vector; selecting for cells
in which a gene trap event occurred (i.e., in which the vector had inserted into a
functioning gene such that a cellular promoter promotes transcription of the
promoterless marker gene); starving the cells of serum; infecting the selected cells with
the virus of choice while continuing serum starvation; and adding back serum to allow
visible colonies to develop, which colonies were cloned by limiting dilution. Genes into
which the retrovirus gene trap vector inserted were then isolated from the colonies using
probes specific for the retrovirus gene trap vector. Thus nucleic acids isolated by this
method are isolated portions of genes.

Thus the present invention provides a method of identifying a cellular gene
necessary for viral growth in a cell and nonessential for cellular survival, comprising (a)
transferring into a cell culture growing in serum-containing medium a vector encoding a
selective marker gene lacking a functional promoter, (b) selecting cells expressing the
marker gene, (c) removing serum from the culture medium, (d) infecting the cell
culture with the virus, and (e) isolating from the surviving cells a cellular gene within
which the marker gene is inserted, thereby identifying a gene necessary for viral growth
in a cell and nonessential for cellular survival. The present invention also provides a
method of identifying a cellular gene used for viral growth in a cell and nonessential for
cellular survival, comprising (a) transferring into a cell culture growing in
serum-containing medium a vector encoding a selective marker gene lacking a functional
promoter, (b) selecting cells expressing the marker gene, (c) removing serum from
the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from
the surviving cells a cellular gene within which the marker gene is inserted, thereby
identifying a gene necessary for viral growth in a cell and nonessential for cellular
survival. In any selected cell type, such as Chinese hamster ovary cells, one can readily
determine if serum starvation is required for selection. If it is not, serum starvation may
be eliminated from the steps.
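
The selection logic of steps (a) through (e) can be illustrated with a toy model. The
sketch below is only an illustration under stated assumptions: the gene names, the
ESSENTIAL and VIRUS_REQUIRED sets and the function select_survivors are hypothetical
and are not part of the method described above. It assumes one insertion per clone and
treats every clone in which the virus can grow as lost, which simplifies the role of
serum starvation.

```python
# Toy model of gene-trap selection; every name and gene set here is hypothetical.

# Genes whose disruption prevents the clone from growing at all (assumed).
ESSENTIAL = {"geneA", "geneB"}
# Genes the virus needs in order to grow in the cell (assumed).
VIRUS_REQUIRED = {"geneA", "geneC", "geneD"}


def select_survivors(trapped_clones):
    """Return the trapped genes found in clones that survive infection.

    trapped_clones maps a clone identifier to the cellular gene into which
    the promoterless marker gene inserted (one insertion per clone assumed).
    """
    candidates = set()
    for gene in trapped_clones.values():
        if gene in ESSENTIAL:
            continue  # disrupted gene is essential: the clone never grows out
        if gene in VIRUS_REQUIRED:
            # the virus cannot grow in this clone, so the clone survives
            candidates.add(gene)
        # otherwise the virus grows and the clone is lost during selection
    return candidates


library = {"clone1": "geneA", "clone2": "geneC", "clone3": "geneE", "clone4": "geneD"}
print(sorted(select_survivors(library)))
```

In this model only genes that are both required by the virus and dispensable for the
cell are recovered, which mirrors the two properties the method selects for.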

Alternatively, instead of removing serum from the culture medium, a serum
factor required by the virus for growth can be inhibited, such as by the administration of
an antibody that specifically binds that factor. Furthermore, if it is believed that there
are no persistently infected cells in the culture, the serum starvation step can be
eliminated and the cells grown in the usual medium for the cell type. If serum starvation is
used, it can be continued for a time after the culture is infected with the virus. Serum
can then be added back to the culture. If some other method is used to inactivate the
factor, it can be discontinued, inactivated or removed (such as removing the anti-factor
antibody, e.g., with a bound antibody directed against that antibody) prior to adding
fresh serum back to the culture. Cells that survive are mutants having an inactivating
insertion in a gene necessary for growth of the virus. The genes having the insertions
can then be isolated by isolating sequences having the marker gene sequences. This
mutational process disturbs a wild type function. A mutant gene may produce a
normal product at a lower level, it may produce a normal product not normally found in
these cells, it may cause the overproduction of a normal product, it may produce an
altered product that has some functions but not others, or it may completely disrupt a
gene function. Additionally, the mutation may disrupt an RNA that has a function but
is never translated into a protein. For example, the alpha-tropomyosin gene has a 3'
RNA that is very important in cell regulation but is never translated into protein (Cell
75:1107-1117 (1993)).

As used herein, a cellular gene "nonessential for cellular survival" means a gene
for which disruption of one or both alleles results in a cell viable for at least a period of
time which allows viral replication to be inhibited for preventative or therapeutic uses or
use in research. A gene "necessary for viral growth" means that the gene product, either
protein or RNA, secreted or not, is necessary, either directly or indirectly,
for the virus to grow, and therefore, in the absence of that gene product (i.e., a
functionally available gene product), at least some of the cells containing the virus die.
For example, such genes can encode cell cycle regulatory proteins, proteins affecting the
vacuolar hydrogen pump, or proteins involved in protein folding and protein
modification, including but not limited to: phosphorylation, methylation, glycosylation,
myristylation or addition of another lipid moiety, or protein processing via enzymatic
processing. Some examples of such genes are exemplified herein, wherein some of the
isolated nucleic acids correspond to genes such as vacuolar H+ATPase, alpha tropomyosin,
gas5 gene, ras complex, N-acetyl-glucosaminyltransferase I mRNA, and calcyclin.

Any virus capable of infecting the cell can be used for this method. A virus can
be selected based upon the particular infection to be studied. However, it is
contemplated by the present invention that many viruses will be dependent upon the
same cellular genes for survival; thus a cellular gene isolated using one virus can be used
as a target for therapy for other viruses as well. Any cellular gene can be tested for
relevancy to any desired virus using the methods set forth herein, i.e., in general, by
inhibiting the gene or its gene product in a cell and determining if the desired virus can
grow in that cell. Some examples of viruses include HIV (including HIV-1 and HIV-2);
parvovirus; papillomaviruses; hantaviruses; influenza viruses (e.g., influenza A, B and C
viruses); hepatitis viruses A to G; caliciviruses; astroviruses; rotaviruses;
coronaviruses, such as human respiratory coronavirus; picornaviruses, such as human
rhinovirus and enterovirus; ebola virus; human herpesvirus (e.g., HSV-1-9); human
cytomegalovirus; human adenovirus; Epstein-Barr virus; and, for animals, the
animal counterpart to any above-listed human virus and animal retroviruses, such as simian
immunodeficiency virus, avian immunodeficiency virus, bovine immunodeficiency virus,
feline immunodeficiency virus, equine infectious anemia virus, caprine arthritis
encephalitis virus or visna virus.

The nucleic acids comprising cellular genes of this invention were isolated by the
above method and as set forth in the examples. The invention includes a nucleic acid
comprising the nucleotide sequence set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16,
SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28,
SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40,
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46,
SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52,
SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58,
SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64,
SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70,
SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75 (this list is
sometimes referred to herein as "SEQ ID NO:5 through SEQ ID NO:75" for brevity). Thus these
nucleic acids can contain, in addition to the nucleotides set forth in each SEQ ID NO in
the sequence listing, additional nucleotides at either end of the molecule. Such
additional nucleotides can be added by any standard method, as known in the art, such
as recombinant methods and synthesis methods. Examples of such nucleic acids
comprising the nucleotide sequence set forth in any entry of the sequence listing
contemplated by this invention include, but are not limited to, for example, the nucleic
acid placed into a vector; a nucleic acid having one or more regulatory regions (e.g.,
promoter, enhancer, polyadenylation site) linked to it, particularly in a functional manner,
i.e., such that an mRNA or a protein can be produced; and a nucleic acid including additional
nucleic acids of the gene, such as a larger or even full length genomic fragment of the
gene, a partial or full length cDNA, or a partial or full length RNA. Making and/or
isolating such larger nucleic acids is further described below and is well known and
standard in the art.

The invention also provides a nucleic acid encoding the protein encoded by the
gene comprising the nucleotide sequence set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16,
SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28,
SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40,
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46,
SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52,
SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58,
SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64,
SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70,
SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, as well as allelic
variants and homologs of each such gene. The gene is readily obtained using standard
methods, as described below and as is known and standard in the art. The present
invention also contemplates any unique fragment of these genes or of the nucleic acids
set forth in any of SEQ ID NO:5 through SEQ ID NO:75. Examples of inventive
fragments of the inventive genes are the nucleic acids whose sequence is set forth in any
of SEQ ID NO:5 through SEQ ID NO:75. To be unique, the fragment must be of
sufficient size to distinguish it from other known sequences, most readily determined by
comparing any nucleic acid fragment to the nucleotide sequences of nucleic acids in
computer databases, such as GenBank. Such comparative searches are standard in the
art. Typically, a unique fragment useful as a primer or probe will be at least about 20 to
about 25 nucleotides in length, depending upon the specific nucleotide content of the
sequence. Additionally, fragments can be, for example, at least about 30, 40, 50, 75,
100, 200 or 500 nucleotides in length. The nucleic acids can be single or double
stranded, depending upon the purpose for which they are intended.
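
As an illustration of the database comparison described above, the following sketch
tests whether a candidate fragment occurs, on either strand, in any sequence of a small
in-memory collection. The known_sequences list is a hypothetical stand-in for a database
such as GenBank, and the function names are assumptions made for this example. An actual
search would use an alignment tool and would also consider near matches, which this
exact-match sketch does not.

```python
# Exact-match uniqueness check against a toy sequence collection (hypothetical data).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_unique_fragment(fragment, known_sequences, min_length=20):
    """Return True if the fragment is long enough and absent from every known sequence.

    Both the fragment and its reverse complement are checked, since a probe
    can hybridize to either strand.
    """
    fragment = fragment.upper()
    if len(fragment) < min_length:
        return False  # too short to be considered distinguishing
    rc = reverse_complement(fragment)
    for known in known_sequences:
        known = known.upper()
        if fragment in known or rc in known:
            return False
    return True


# Hypothetical stand-in for a sequence database.
known_sequences = ["ACGT" * 8, "T" * 30]

print(is_unique_fragment("GATTACA" * 3, known_sequences))  # absent on both strands
print(is_unique_fragment("CGTA" * 5, known_sequences))     # present in first entry
print(is_unique_fragment("A" * 20, known_sequences))       # reverse complement present
```

The min_length default of 20 follows the lower end of the probe length range given
above; it is a parameter, not a fixed requirement.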

The present invention further provides a nucleic acid comprising the regulatory
region of a gene comprising the nucleotide sequences set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16,
SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28,
SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40,
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46,
SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52,
SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58,
SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64,
SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70,
SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75.
Additionally provided is a construct comprising such a regulatory region functionally
linked to a reporter gene. Such reporter gene constructs can be used to screen for
compounds and compositions that affect expression of the gene comprising the nucleic
acids whose sequence is set forth in any of SEQ ID NO:5 through SEQ ID NO:75.

The nucleic acids set forth in the sequence listing are gene fragments; the entire
coding sequence and the entire gene that comprises each fragment are both
contemplated herein and are readily obtained by standard methods, given the nucleotide
sequences presented in the sequence listing (see, e.g., Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989; DNA Cloning: A Practical Approach, Volumes I and II,
Glover, D.M. ed., IRL Press Limited, Oxford, 1985). To obtain the entire genomic
gene, briefly, a nucleic acid whose sequence is set forth in any of SEQ ID NO:1 through
SEQ ID NO:83, or preferably in any of SEQ ID NO:5 through SEQ ID NO:83, or a
smaller fragment thereof, is utilized as a probe to screen a genomic library under high
stringency conditions, and isolated clones are sequenced. Once the sequence of the new
clone is determined, a probe can be devised from a portion of the new clone not present
in the previous fragment and hybridized to the library to isolate more clones containing
fragments of the gene. In this manner, by repeating this process in organized fashion,
one can "walk" along the chromosome and eventually obtain nucleotide sequence for the
entire gene. Similarly, one can use portions of the present fragments, or additional
fragments obtained from the genomic library, that contain open reading frames to
screen a cDNA library to obtain a cDNA having the entire coding sequence of the gene.
Repeated screens can be utilized as described above to obtain the complete sequence
from several clones if necessary. The isolates can then be sequenced to determine the
nucleotide sequence by standard means such as dideoxynucleotide sequencing methods
(see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989).
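
The iterative "walking" procedure described above can be sketched as a toy simulation.
In the sketch below, exact substring matching stands in for hybridization, each probe is
assumed to occur only once in the genome, and the genome, library, seed and function
names are all hypothetical illustrations rather than data or procedures from this
description.

```python
# Toy chromosome walk: extend a contig by repeatedly probing a clone library.
# Exact substring matching is used as a stand-in for probe hybridization.


def extend_right(contig, library, probe_len):
    """Probe with the right end of the contig and append the longest new tail."""
    probe = contig[-probe_len:]
    best = ""
    for clone in library:
        i = clone.find(probe)
        if i != -1:
            tail = clone[i + probe_len:]  # sequence beyond the probe in this clone
            if len(tail) > len(best):
                best = tail
    return contig + best


def extend_left(contig, library, probe_len):
    """Probe with the left end of the contig and prepend the longest new head."""
    probe = contig[:probe_len]
    best = ""
    for clone in library:
        i = clone.find(probe)
        if i != -1:
            head = clone[:i]  # sequence before the probe in this clone
            if len(head) > len(best):
                best = head
    return best + contig


def chromosome_walk(seed, library, probe_len=6):
    """Repeat both extensions until no clone adds new sequence."""
    contig = seed
    while True:
        longer = extend_left(extend_right(contig, library, probe_len), library, probe_len)
        if longer == contig:
            return contig
        contig = longer


# Hypothetical 40-nucleotide "gene" and a library of overlapping clones.
genome = "ATGGCGTACCTTGAGCATTCGGAACTGTCAAGGCTTACGA"
library = [genome[0:16], genome[10:26], genome[20:36], genome[26:40]]
seed = genome[12:22]  # plays the role of the originally isolated fragment

print(chromosome_walk(seed, library) == genome)
```

Each pass through the loop corresponds to one round of screening: a probe is taken
from the newest end of the assembled sequence, and clones that contain it contribute
sequence not present in the previous fragment.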

The present genes were isolated from rat; however, homologs in any desired
species, preferably mammalian, such as human, can readily be obtained by screening a
library from that species (e.g., a human library), genomic or cDNA, with a probe
comprising sequences of the nucleic acids set forth in the sequence listing herein, or
fragments thereof, and isolating genes specifically hybridizing with the probe under
preferably relatively high stringency hybridization conditions. For example, high salt
conditions (e.g., in 6X SSC or 6X SSPE) and/or high temperatures of hybridization can be
used. For example, the hybridization temperature is typically about 5°C to 20°C below
the Tm (the melting temperature at which half of the molecules dissociate from their
partners) for the given chain length. As is known in the art, the nucleotide composition
of the hybridizing region factors in determining the melting temperature of the hybrid.
For 20mer probes, for example, the recommended hybridization temperature is typically
about 55-58°C. Additionally, the rat sequence can be utilized to devise a probe for a
homolog in any specific animal by determining the amino acid sequence for a portion of
the rat protein, and selecting a probe with optimized codon usage to encode the amino
acid sequence of the homolog in that particular animal. Any isolated gene can be
confirmed as the targeted gene by sequencing the gene to determine that it contains the
nucleotide sequence listed herein as comprising the gene. Any homolog can be confirmed
as a homolog by its functionality.
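
For short oligonucleotide probes, one commonly used rule of thumb (the Wallace rule,
which is not taken from this description and is introduced here only as an assumption)
estimates Tm as 2°C for each A or T plus 4°C for each G or C. The sketch below applies
that estimate and then derives a hybridization temperature range 5°C to 20°C below Tm,
as described above. The function names and the example probe are hypothetical, and
empirically determined conditions take precedence over such estimates.

```python
# Rough Tm estimate for a short probe and the resulting hybridization window.
# The Wallace rule used here is an assumed approximation for short oligonucleotides.


def wallace_tm(probe):
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C)."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    if at + gc != len(probe):
        raise ValueError("probe must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def hybridization_window(tm, low_offset=20.0, high_offset=5.0):
    """Return (lowest, highest) hybridization temperature relative to Tm."""
    return (tm - low_offset, tm - high_offset)


probe = "ATGC" * 5  # hypothetical 20-mer with 50% G+C content
tm = wallace_tm(probe)
print(tm, hybridization_window(tm))
```

For this 20-mer with 50% G+C content the estimate is 60°C, so the upper end of the
window is 55°C, which is in line with the 55-58°C figure given above for 20mer probes.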

Additionally contemplated by the present invention are nucleic acids, from any
desired species, preferably mammalian and more preferably human, having 98%, 95%,
90%, 85%, 80%, 70%, 60%, or 50% homology, or greater, in the region of homology,
to a region in an exon of a nucleic acid encoding the protein encoded by the gene
comprising the nucleotide sequence set forth in any of SEQ ID NO:5 through SEQ ID
NO:75 of the sequence listing or to homologs thereof. Also contemplated by the
present invention are nucleic acids, from any desired species, preferably mammalian and
more preferably human, having 98%, 95%, 90%, 85%, 80%, 70%, 60%, or 50%
homology, or greater, in the region of homology, to a region in an exon of a nucleic acid
comprising the nucleotide sequence set forth in any of SEQ ID NO:5 through SEQ ID
NO:75 of the sequence listing or to homologs thereof. These genes can be synthesized
or obtained by the same methods used to isolate homologs, with stringency of
hybridization and washing, if desired, reduced accordingly as the homology desired is
decreased, and further, depending upon the G-C or A-T richness of any area wherein
variability is searched for. Allelic variants of any of the present genes or of their
homologs can readily be isolated and sequenced by screening additional libraries
following the protocol above. Methods of making synthetic genes are described in U.S.
Patent No. 5,503,995 and the references cited therein.
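
The percentage figures above refer to similarity within a region of homology. As a
minimal sketch, assuming the two regions have already been aligned to equal length with
"-" marking gaps, percent identity can be computed as shown below. The function name and
the example sequences are hypothetical, and the description above does not prescribe a
particular alignment or scoring method.

```python
# Percent identity over a pre-aligned region; "-" marks an alignment gap.


def percent_identity(aligned_a, aligned_b):
    """Return the percentage of compared columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # gap in both sequences: column carries no information
        compared += 1
        if a == b:
            matches += 1  # a gap opposite a residue never matches
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned exon regions differing at two of twenty positions.
rat_region = "ATGGCCAAGCTGACCGGTAA"
human_region = "ATGGCTAAGCTGACAGGTAA"
print(percent_identity(rat_region, human_region))
```

A gap opposite a nucleotide is counted here as a mismatch; other conventions exist and
would give slightly different percentages for gapped alignments.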

The nucleic acid encoding any selected protein of the present invention can be
any nucleic acid that functionally encodes that protein. For example, to functionally
encode, i.e., allow the nucleic acid to be expressed, the nucleic acid can include, for
example, exogenous or endogenous expression control sequences, such as an origin of
replication, a promoter, an enhancer, and necessary information processing sites, such as
ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional
terminator sequences. Preferred expression control sequences can be promoters derived
from metallothionein genes, actin genes, immunoglobulin genes, CMV, SV40,
adenovirus, bovine papilloma virus, etc. Expression control sequences can be selected
for functionality in the cells in which the nucleic acid will be placed. A nucleic acid
encoding a selected protein can readily be determined based upon the amino acid
sequence of the selected protein, and, clearly, many nucleic acids will encode any
selected protein.

The present invention additionally provides a nucleic acid that selectively
hybridizes under stringent conditions with a nucleic acid encoding the protein encoded
by the gene comprising the nucleotide sequence set forth in any sequence listed herein
(i.e., any of SEQ ID NO:5 through SEQ ID NO:75). This hybridization can be specific.
The degree of complementarity between the hybridizing nucleic acid and the sequence to
which it hybridizes should be at least enough to exclude hybridization with a nucleic acid
encoding an unrelated protein. Thus, a nucleic acid that selectively hybridizes with a
nucleic acid of the present protein coding sequence will not selectively hybridize under
stringent conditions with a nucleic acid for a different, unrelated protein, and vice versa.
Typically, the stringency of hybridization to achieve selective hybridization involves
hybridization in high ionic strength solution (6X SSC or 6X SSPE) at a temperature that
is about 12-25°C below the Tm (the melting temperature at which half of the molecules
dissociate from their partners) followed by washing at a combination of temperature and
salt concentration chosen so that the washing temperature is about 5°C to 20°C below
the Tm of the hybrid molecule. The temperature and salt conditions are readily
determined empirically in preliminary experiments in which samples of reference DNA
immobilized on filters are hybridized to a labeled nucleic acid of interest and then
washed under conditions of different stringencies. Hybridization temperatures are
typically higher for DNA-RNA and RNA-RNA hybridizations. The washing
temperatures can be used as described above to achieve selective stringency, as is
known in the art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd
Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989; Kunkel et
al., Methods Enzymol. 154:367 (1987)). Nucleic acid fragments that selectively
hybridize to any given nucleic acid can be used, e.g., as primers and/or probes for further
hybridization or for amplification methods (e.g., polymerase chain reaction (PCR), ligase
chain reaction (LCR)). A preferable stringent hybridization condition for a DNA:DNA
hybridization can be at about 68°C (in aqueous solution) in 6X SSC or 6X SSPE
followed by washing at 68°C.

The present invention additionally provides a protein encoded by a nucleic acid 
encoding the protein encoded by the gene comprising any of the nucleotide sequences 
set forth herein (i.e. ., any of SEQ ID NO: 5 through SEQ ID NO:75). The protein can 

i 

be readily obtained by any of several means. For example, the nucleotide sequence of 

10 coding regions of the gene can be translated and then the corresponding polypeptide can 
be synthesized mechanically by standard methods. Additionally, the coding regions of 
the genes can be expressed or synthesized, an antibody specific for the resulting 
polypeptide can be raised by standard methods (see, e.g., Harlow and Lane, Antibodies: 
A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New
York, 1988), and the protein can be isolated from other cellular proteins by selective
binding to the antibody. This protein can be purified to the extent desired by
standard methods of protein purification (see, e.g., Sambrook et al., Molecular Cloning: 
A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, 
New York, 1989). The amino acid sequence of any protein, polypeptide or peptide of
this invention can be deduced from the nucleic acid sequence, or it can be determined by
sequencing an isolated or recombinantly produced protein. 
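
As a minimal sketch of deducing an amino acid sequence from a nucleic acid sequence, the following Python translates an in-frame DNA coding region using the standard genetic code. The function name `translate` and the example sequence are illustrative assumptions and are not sequences of the invention; in practice the reading frame of a coding region would first have to be identified.

```python
# Illustrative sketch: deduce a polypeptide sequence from a DNA coding region
# using the standard genetic code. Names and the example input are hypothetical.

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame coding sequence (5'->3') up to the first stop codon."""
    seq = coding_sequence.upper().replace("U", "T")  # accept RNA input as well
    residues = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break  # stop codon ends the polypeptide
        residues.append(residue)
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical 15-nucleotide coding region ending in a stop codon.
    print(translate("ATGGCCAAGTGGTGA"))
```

The sketch stops at the first stop codon and raises `KeyError` on ambiguous bases, which seemed the simplest behavior for a sequence that may still contain minor sequencing errors.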

The terms "peptide," "polypeptide" and "protein" are used interchangeably herein
and refer to a polymer of amino acids, including full-length proteins and fragments
thereof. As used in the specification and in the claims, "a" can mean one or more,
depending upon the context in which it is used. An amino acid residue is an amino acid
formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages.
The amino acid residues described herein are preferably in the "L" isomeric form.
However, residues in the "D" isomeric form can be substituted for any L-amino acid
residue, as long as the desired functional property is retained by the polypeptide.
Standard polypeptide nomenclature (described in J. Biol. Chem., 243:3552-59 (1969)
and adopted at 37 CFR § 1.822(b)) is used herein.

As will be appreciated by those skilled in the art, the invention also includes
those polypeptides having slight variations in amino acid sequences or other properties.
Amino acid substitutions can be selected by known parameters to be neutral (see, e.g.,
Robinson WE Jr, and Mitchell WM., AIDS 4:S151-S162 (1990)). Such variations may
arise naturally as allelic variations (e.g., due to genetic polymorphism) or may be
produced by human intervention (e.g., by mutagenesis of cloned DNA sequences), such
as induced point, deletion, insertion and substitution mutants. Minor changes in amino
acid sequence are generally preferred, such as conservative amino acid replacements,
small internal deletions or insertions, and additions or deletions at the ends of the
molecules. Substitutions may be designed based on, for example, the model of Dayhoff,
et al. (in Atlas of Protein Sequence and Structure 1978, Nat'l Biomed. Res. Found.,
Washington, DC). These modifications can result in changes in the amino acid
sequence, provide silent mutations, modify a restriction site, or provide other specific
mutations. Likewise, such amino acid changes result in a different nucleic acid encoding
the polypeptides and proteins. Thus, alternative nucleic acids are also contemplated by
such modifications.

The present invention also provides cells containing a nucleic acid of the 
invention. A cell containing a nucleic acid encoding a protein typically can replicate the 
DNA and, further, typically can express the encoded protein. The cell can be a
prokaryotic cell, particularly for the purpose of producing quantities of the nucleic acid,
or a eukaryotic cell, particularly a mammalian cell. The cell is preferably a mammalian 
cell for the purpose of expressing the encoded protein so that the resultant produced 
protein has mammalian protein processing modifications. 

Nucleic acids of the present invention can be delivered into cells by any selected
means, in particular depending upon the purpose of the delivery of the compound and
the target cells. Many delivery means are well-known in the art. For example, 
electroporation, calcium phosphate precipitation, microinjection, cationic or anionic 
liposomes, and liposomes in combination with a nuclear localization signal peptide for 
delivery to the nucleus can be utilized, as is known in the art. 

The present invention also contemplates that the mutated cellular genes
necessary for viral growth, produced by the present method, as well as cells containing
these mutants, can also be useful. These mutated genes and cells containing them can be
isolated and/or produced according to the methods herein described and using standard 
methods. 

It should be recognized that the sequences set forth herein may contain minor 
sequencing errors. Such errors can be corrected, for example, by using the hybridization
procedure described above with various probes derived from the described sequences 
such that the coding sequence can be reisolated and resequenced. 

As described in the examples, the present invention provides the discovery of a
"serum survival factor" present in serum that is necessary for the survival of persistently
virally infected cells. Isolation and characterization of this factor have shown it to be a
protein, to have a molecular weight of between about 50 kD and 100 kD, to resist
inactivation in low pH (e.g., pH 2) and chloroform extraction, and to be inactivated by
boiling for about 5 minutes and in low ionic strength solution (e.g., about 10 mM to
about 50 mM). The present invention thus provides a purified mammalian serum
protein having a molecular weight of between about 50 kD and 100 kD which resists
inactivation in low pH and resists inactivation by chloroform extraction, which
inactivates when boiled and inactivates in low ionic strength solution, and which when
removed from a cell culture comprising cells persistently infected with reovirus
selectively substantially prevents survival of cells persistently infected with reovirus.
The factor, fitting the physical characteristics described above, can readily be verified by
adding it to non-serum-containing medium (which previously could not support survival
of persistently virally infected cells) and determining whether this medium with the
added putative factor can now support persistently virally infected cells, particularly cells
persistently infected with reovirus. As used herein, a "purified" protein means the
protein is at least of sufficient purity such that an approximate molecular weight can be
determined. 

The amino acid sequence of the protein can be elucidated by standard methods. 
For example, an antibody to the protein can be raised and used to screen an expression 
library to obtain a nucleic acid sequence encoding the protein. This nucleic acid sequence is
then simply translated into the corresponding amino acid sequence. Alternatively, a
portion of the protein can be directly sequenced by standard amino acid sequencing
methods (amino-terminus sequencing). This amino acid sequence can then be used to
generate an array of nucleic acid probes that encompasses all possible coding sequences
for a portion of the amino acid sequence. The array of probes is used to screen a cDNA
library to obtain the remainder of the coding sequence and thus ultimately the
corresponding amino acid sequence.
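
The probe array described above can be sketched as a back-translation: because most amino acids are encoded by several codons, a short peptide corresponds to a set of possible coding sequences. The Python below is only an illustration under the standard genetic code; the names `degeneracy` and `all_coding_sequences` and the example peptides are hypothetical and are not taken from the serum protein.

```python
# Illustrative sketch: enumerate every DNA sequence that could encode a short
# peptide, as a starting point for a degenerate probe set. Names are hypothetical.
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the codons that encode it (stop codons excluded).
CODONS_FOR = {}
for i, b1 in enumerate(BASES):
    for j, b2 in enumerate(BASES):
        for k, b3 in enumerate(BASES):
            residue = AMINO_ACIDS[16 * i + 4 * j + k]
            if residue != "*":
                CODONS_FOR.setdefault(residue, []).append(b1 + b2 + b3)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    count = 1
    for residue in peptide.upper():
        count *= len(CODONS_FOR[residue])
    return count


def all_coding_sequences(peptide: str) -> list:
    """Every possible coding sequence for the peptide (grows multiplicatively)."""
    choices = [CODONS_FOR[residue] for residue in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    print(degeneracy("LSR"))            # each residue has six codons
    print(all_coding_sequences("MFW"))  # Met and Trp have a single codon each
```

Because the count multiplies with each residue, a stretch rich in residues with few codons (such as methionine or tryptophan) would keep the probe pool small.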

The present invention also provides methods of detecting and isolating additional 
serum survival factors. For example, to determine if any known serum components are 
necessary for viral growth, the known components can be inhibited in, or eliminated 
from, the culture medium, and it can be observed whether viral growth is inhibited by
determining if persistently infected cells do not survive. One can add the factor back (or
remove the inhibition) and determine whether the factor allows for viral growth. 

Additionally, other, unknown serum components can also be found to be
essential for viral growth. Serum can be fractionated by various standard means, and
fractions added to serum-free medium to determine if a factor is present in a fraction
that allows viral growth previously inhibited by the lack of serum. Fractions having this
activity can then be further fractionated until the factor is relatively free of other
components. The factor can then be characterized by standard methods, such as size
fractionation, denaturation and/or inactivation by various means, etc. Preferably, once
the factor has been purified to a desired level of purity, it is added to cells in serum-free
medium to confirm that it bestows the function of allowing virus to grow when serum-free
medium alone did not. This method can be repeated to confirm the requirement for
the specific factor for any desired virus, since each serum factor found to be required by
any one virus can also be required by many other viruses. In general, the closer the
viruses are related and the more similar the infection modes of the viruses, the more
likely that a factor required by one virus will be required by the other.

The present invention also provides methods of treating virus infections utilizing 
applicants' discoveries. The subject of any of the herein described methods can be any 
animal, preferably a mammal, such as a human, a veterinary animal, such as a cat, dog, 
horse, pig, goat, sheep, or cow, or a laboratory animal, such as a mouse, rat, rabbit, or
guinea pig, depending upon the virus.

The present invention provides a method of reducing or inhibiting, and thereby
treating, a viral infection in a subject, comprising administering to the subject an
inhibiting amount of a composition that inhibits functioning of the serum protein
described herein, i.e., the serum protein having a molecular weight of between about 50
kD and 100 kD which resists inactivation in low pH and resists inactivation by
chloroform extraction, which inactivates when boiled and inactivates in low ionic
strength solution, and which when removed from a cell culture comprising cells
persistently infected with the virus prevents survival of at least some cells persistently
infected with the virus, thereby treating the viral infection. The composition can
comprise, for example, an antibody that specifically binds the serum protein, or an
antisense RNA that binds an RNA encoded by a gene functionally encoding the serum
protein.

Any virus capable of infecting the selected subject to be treated can be treated by 
the present method. As described above, any serum protein or survival factor found by
the present methods to be necessary for growth of any one virus can be found to be
necessary for growth of many other viruses. For any given virus, the serum protein or
factor can be confirmed to be required for growth by the methods described herein. The
cellular genes identified by the examples using reovirus, a mammalian pathogen, and a
rat cell system have general applicability to other virus infections that include all of the
known as well as yet to be discovered human pathogens, including, but not limited to:
human immunodeficiency viruses (e.g., HIV-1, HIV-2); parvovirus; papillomaviruses;
hantaviruses; influenza viruses (e.g., influenza A, B and C viruses); hepatitis viruses A
to G; caliciviruses; astroviruses; rotaviruses; coronaviruses, such as human respiratory
coronavirus; picornaviruses, such as human rhinovirus and enterovirus; ebola virus;
human herpesvirus (e.g., HSV-1-9); human cytomegalovirus; human adenovirus;
Epstein-Barr virus; and, for animals, the animal counterpart to any above-listed
human virus, as well as animal retroviruses, such as simian immunodeficiency virus, avian
immunodeficiency virus, bovine immunodeficiency virus, feline immunodeficiency virus, 
equine infectious anemia virus, caprine arthritis encephalitis virus or visna virus. 

A protein-inhibiting amount of the composition can be readily determined, such
as by administering varying amounts to cells or to a subject and then adjusting the
effective amount for inhibiting the protein according to the volume of blood or weight of
the subject. Compositions that bind to the protein can be readily determined by running
the putatively bound protein on a protein gel and observing an alteration in the protein's
migration through the gel. Inhibition of the protein can be determined by any desired
means such as adding the inhibitor to complete media used to maintain persistently
infected cells and observing the cells' viability. The composition can comprise, for
example, an antibody that specifically binds the serum protein. Specific binding by an
antibody means that the antibody can be used to selectively remove the factor from
serum or inhibit the factor's biological activity and can readily be determined by
radioimmunoassay (RIA), bioassay, or enzyme-linked immunosorbent assay (ELISA) technology.
The composition can comprise, for example, an antisense RNA that specifically binds an
RNA encoded by the gene encoding the serum protein. Antisense RNAs can be
synthesized and used by standard methods (e.g., Antisense RNA and DNA, D. A.
Melton, Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1988)).

The present methods provide a method of screening a compound for treating a
viral infection, comprising administering the compound to a cell containing a cellular
gene functionally encoding a gene product necessary for reproduction of the virus in the
cell but not necessary for survival of the cell and detecting the level of the gene product
produced, a decrease or elimination of the gene product indicating a compound for
treating the viral infection. The present methods also provide a method of screening a
compound for effectiveness in treating a viral infection, comprising administering the
compound to a cell containing a cellular gene functionally encoding a gene product
necessary for reproduction of the virus in the cell but not necessary for survival of the
cell and detecting the level of the gene product produced, a decrease or elimination of
the gene product indicating a compound effective for treating the viral infection. The
cellular gene can be, for example, any gene provided herein, i.e., any of the genes
comprising the nucleotide sequences set forth in any of SEQ ID NO:1 through SEQ ID
NO:75, or any other gene obtained using the methods provided herein for obtaining
such genes. The level of the gene product can be measured by any standard means, such as
by detection with an antibody specific for the protein. The level of gene product can be
compared to the level of the gene product in a control cell not contacted with the
compound. The level of gene product can be compared to the level of the gene product
in the same cell prior to addition of the compound. Relatedly, the regulatory region of 
the gene can be functionally linked to a reporter gene and compounds can be screened 
for inhibition of the reporter gene. Such reporter constructs are described herein. 
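
The comparison against a control cell can be illustrated with a short calculation. The following Python is a hedged sketch only: the function names and the 50% cutoff are assumptions chosen for illustration, since the text specifies only that a decrease or elimination of the gene product indicates a candidate compound.

```python
# Illustrative sketch: flag a compound as a candidate when it lowers the level
# of the target gene product relative to an untreated control cell.
# The default 50% cutoff is an arbitrary assumption, not a value from the text.


def fractional_decrease(treated_level: float, control_level: float) -> float:
    """Fraction by which the gene product level fell relative to the control."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (control_level - treated_level) / control_level


def is_candidate(treated_level: float, control_level: float,
                 min_decrease: float = 0.5) -> bool:
    """True when the decrease meets or exceeds the chosen cutoff."""
    return fractional_decrease(treated_level, control_level) >= min_decrease


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units (e.g., from an antibody-based assay).
    print(fractional_decrease(25.0, 100.0))
    print(is_candidate(25.0, 100.0), is_candidate(90.0, 100.0))
```

The same calculation would apply when the reference is the same cell before addition of the compound, or when the measured quantity is a reporter gene product.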

The present invention provides a method of selectively eliminating cells
persistently infected with a virus from an animal cell culture capable of surviving for a
first period of time in the absence of serum, comprising propagating the cell culture in
the absence of serum for a second time period which a persistently infected cell cannot
survive without serum, thereby selectively eliminating from the cell culture cells
persistently infected with the virus. The second time period should be shorter than the
first time period. Thus one can simply eliminate serum from a standard culture medium
composition for a period of time (e.g., by removing serum-containing medium from the
culture container, rinsing the cells, and adding serum-free medium back to the
container), then, after a time of serum starvation, return serum to the culture medium.
Alternatively, one can inhibit a serum survival factor from the culture in place of the step
of serum starvation. Furthermore, one can instead interfere with the virus-factor
interaction. Such a viral elimination method can periodically be performed for cultured
cells to ensure that they remain virus-free. The time period of serum removal can
greatly vary, with a typical range being about 1 to about 30 days; a preferable period
can be about 3 to about 10 days, and a more preferable period can be about 5 days to
about 7 days. This time period can be selected based upon ability of the specific cell to
survive without serum as well as the life cycle of the virus, e.g., for reovirus, which has a
life cycle of about 24 hours, 3 days' starvation of cells provides dramatic results.

Furthermore, the time period can be shortened by also passaging the cells during
the starvation; in general, increasing the number of passages can decrease the time of
serum starvation (or serum factor inhibition) needed to get full clearance of the virus
from the culture. While passaging, the cells typically are exposed briefly to serum
(typically for about 3 to about 24 hours). This exposure both stops the action of the
trypsin used to dislodge the cells and stimulates the cells into another cycle of growth,
thus aiding in this selection process. Thus a starvation/serum cycle can be repeated to
optimize the selective effect. Other standard culture parameters, such as confluency of
the cultures, pH, temperature, etc. can be varied to alter the needed time period of
serum starvation (or serum survival factor inhibition). This time period can readily be
determined for any given viral infection by simply removing the serum for various
periods of time, then testing the cultures for the presence of the infected cells (e.g., by
ability to survive in the absence of serum and confirmed by quantitating virus in cells by
standard virus titration and immunohistochemical techniques) at each tested time period,
and then detecting at which time periods of serum deprivation the virally infected cells
were eliminated. It is preferable that shorter time periods of serum deprivation that still
provide elimination of the persistently infected cells be used. Furthermore, the cycle of
starvation, then adding back serum and determining amount of virus remaining in the
culture can be repeated until no virally infected cells remain in the culture.
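
Selecting the starvation period from such a time course can be sketched as follows. This Python is an illustration under stated assumptions: the function name, the data layout (days of serum deprivation mapped to whether infected cells were still detected) and the example values are hypothetical, and the detection itself would come from the virus titration or immunohistochemical testing described above.

```python
# Illustrative sketch: pick the shortest tested serum-deprivation period after
# which no virally infected cells were detected. Data values are hypothetical.
from typing import Mapping, Optional


def shortest_clearing_period(infected_detected: Mapping[float, bool]) -> Optional[float]:
    """Return the shortest tested period (in days) with no infected cells detected.

    Returns None when every tested period still showed infected cells.
    """
    cleared = [days for days, detected in infected_detected.items() if not detected]
    return min(cleared) if cleared else None


if __name__ == "__main__":
    # Hypothetical time course: days without serum -> infected cells detected?
    time_course = {1: True, 3: False, 5: False, 7: False}
    print(shortest_clearing_period(time_course))
```

A result of None would suggest testing longer periods, adding passages, or repeating the starvation/serum cycle.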

Thus, the present method can further comprise passaging the cells, i.e., 
transferring the cell culture from a first container to a second container. Such transfer 
can facilitate the selective lack of survival of virally infected cells. Transfer can be
repeated several times. Transfer is achieved by standard methods of tissue culture (see,
e.g., Freshney, Culture of Animal Cells, A Manual of Basic Technique, 2nd Ed. Alan R.
Liss, Inc., New York, 1987).

The present method further provides a method of selectively eliminating from a 
cell culture cells persistently infected with a virus, comprising propagating the cell
culture in the absence of a functional form of the serum protein having a molecular
weight of between about 50 kD and 100 kD which resists inactivation in low pH and
resists inactivation by chloroform extraction, which inactivates when boiled and
inactivates in low ionic strength solution, and which when removed from a cell culture
comprising cells persistently infected with reovirus substantially prevents survival of
cells persistently infected with reovirus. The absence of the functional form can be
achieved by any of several standard means, such as by binding the protein to an antibody
selective for it (binding the antibody in serum either before or after the serum is added to
the cells; if before, the serum protein can be removed from the serum by, e.g., binding
the antibody to a column and passing the serum over the column and then administering
the survival protein-free serum to the cells), by administering a compound that
inactivates the protein, or by administering a compound that interferes with the
interaction between the virus and the protein. 

Thus, the present invention provides a method of selectively eliminating from a 
cell culture propagated in serum-containing medium cells persistently infected with a 
virus, comprising inhibiting in the serum the protein having a molecular weight of
between about 50 kD and 100 kD which resists inactivation in low pH and resists
inactivation by chloroform extraction, which inactivates when boiled and inactivates in
low ionic strength solution, and which when removed from a cell culture comprising
cells persistently infected with reovirus substantially prevents survival of cells
persistently infected with reovirus. Alternatively, the interaction between the virus and
the serum protein can be disrupted to selectively eliminate cells persistently infected with 
the virus. 

Any virus capable of some form of persistent infection may be eliminated from a 
cell culture utilizing the present elimination methods, including removing, inhibiting or 
otherwise interfering with a serum protein, such as the one exemplified herein, and also
including removing, inhibiting or otherwise interfering with a gene product from any 
cellular gene found by the present method to be necessary for viral growth yet 
nonessential to the cell. For example, DNA viruses or RNA viruses can be targeted. 
One can readily determine whether cells infected with a selected virus can be selectively
removed from a culture through removal of serum by starving cells permissive to the
virus of serum (or inhibiting the serum survival factor), adding the selected virus to the 
cells, adding serum to the culture, and observing whether infected cells die (i.e., by 
titering levels of virus in the surviving cells with an antibody specific for the virus). 

A culture of any animal cell (i.e., any cell that is typically grown and maintained
in culture in serum) that can be maintained for a period of time in the absence of serum,
can be purified from viral infection utilizing the present method. For example, primary 
cultures as well as established cultures and cell lines can be used. Furthermore, cultures 
of cells from any animal and any tissue or cell type within that animal that can be 
cultured and that can be maintained for a period of time in the absence of serum can be
used. For example, cultures of cells from tissues typically infected, and particularly
persistently infected, by an infectious virus could be used.

As used in the claims, "in the absence of serum" means at a level at which
persistently virally infected cells do not survive. Typically, the threshold level is about
1% serum in the media. Therefore, about 1% serum or less can be used, such as about
1%, 0.75%, 0.50%, 0.25%, 0.1% or no serum.

As used herein, "selectively eliminating" cells persistently infected with a virus
means that substantially all of the cells persistently infected with the virus are killed such
that the presence of virally infected cells cannot be detected in the culture immediately
after the elimination procedure has been performed. Furthermore, "selectively
eliminating" includes that cells not infected with the virus are generally not killed by the
method. Some surviving cells may still produce virus but at a lower level, and some
may be defective in pathways that lead to death by the virus. Typically, for cells
persistently infected with virus to be substantially all killed, more than about 90% of the
cells, and more preferably more than about 95%, 98%, 99%, or 99.99% of
virus-containing cells in the culture are killed.
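
The "more than about 90%" criterion for substantially all infected cells being killed can be expressed as a simple calculation. The Python below is a sketch only; the function names and the example counts are hypothetical, and how infected cells are counted before and after the procedure is left to the standard titration methods mentioned elsewhere in the text.

```python
# Illustrative sketch: compute the fraction of virus-containing cells killed and
# compare it with the "more than about 90%" criterion. Counts are hypothetical.


def fraction_killed(infected_before: int, infected_after: int) -> float:
    """Fraction of initially infected cells no longer present after the procedure."""
    if infected_before <= 0:
        raise ValueError("infected_before must be positive")
    return 1.0 - infected_after / infected_before


def substantially_all_killed(infected_before: int, infected_after: int,
                             threshold: float = 0.90) -> bool:
    """True when more than `threshold` of the infected cells were killed."""
    return fraction_killed(infected_before, infected_after) > threshold


if __name__ == "__main__":
    print(fraction_killed(200, 50))
    print(substantially_all_killed(10000, 100))
```

Raising `threshold` to 0.95, 0.98, 0.99 or 0.9999 would correspond to the more preferred levels listed in the definition.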

The present method also provides a nucleic acid comprising the regulatory
region of any of the genes. Such regulatory regions can be isolated from the genomic
sequences isolated and sequenced as described above and identified by any
characteristics observed that are characteristic for regulatory regions of the species and
by their relation to the start codon for the coding region of the gene. The present
invention also provides a construct comprising the regulatory region functionally linked
to a reporter gene. Such constructs are made by routine subcloning methods, and many
vectors are available into which regulatory regions can be subcloned upstream of a
marker gene. Marker genes can be chosen for ease of detection of marker gene product.

The present method therefore also provides a method of screening a compound
for treating a viral infection, comprising administering the compound to a cell containing
any of the above-described constructs, comprising a regulatory region of one of the
genes comprising the nucleotide sequence set forth in any of SEQ ID NO:1 through
SEQ ID NO:75 functionally linked to a reporter gene, and detecting the level of the
reporter gene product produced, a decrease or elimination of the reporter gene product
indicating a compound for treating the viral infection. Compounds detected by this
method would inhibit transcription of the gene from which the regulatory region was
isolated, and thus, in treating a subject, would inhibit the production of the gene product
produced by the gene, and thus treat the viral infection. 

The present invention additionally provides a method of reducing or inhibiting a 
viral infection in a subject, comprising administering to the subject an amount of a 
composition that inhibits expression or functioning of a gene product encoded by a gene
comprising the nucleic acid set forth in any of SEQ ID NO:1 through SEQ ID NO:75,
or a homolog thereof, thereby treating the viral infection. The composition can comprise,
for example, an antibody that binds a protein encoded by the gene. The composition
can also comprise an antibody that binds a receptor for a protein encoded by the gene.
Such an antibody can be raised against the selected protein by standard methods, and
can be either polyclonal or monoclonal, though monoclonal is preferred. Alternatively,
the composition can comprise an antisense RNA that binds an RNA encoded by the
gene. Furthermore, the composition can comprise a nucleic acid functionally encoding
an antisense RNA that binds an RNA encoded by the gene. Other useful compositions
will be readily apparent to the skilled artisan.

The present invention further provides a method of reducing or inhibiting a viral 
infection in a subject comprising mutating ex vivo in a selected cell from the subject an 
endogenous gene comprising the nucleic acid set forth in any of SEQ ID NO: 1 through 
SEQ ID NO: 75, or a homolog thereof, to a gene form incapable of producing a
functional gene product of the gene or a gene form producing a reduced amount of a
functional gene product of the gene, and replacing the cell in the subject, thereby 
reducing viral infection of cells in the subject. The cell can be selected according to the 
typical target cell of the specific virus whose infection is to be reduced, prevented or 
inhibited. A preferred cell for several viruses is a hematopoietic cell. When the selected
cell is a hematopoietic cell, viruses which can be reduced or inhibited from infection can
include, for example, HIV, including HIV-1 and HIV-2. 

The present invention also provides a method of reducing or inhibiting a viral 
infection in a subject comprising mutating ex vivo in a selected cell from the subject an 
endogenous gene comprising a nucleic acid isolated by a method comprising
(a) transferring into a cell culture growing in serum-containing medium a vector
encoding a selective marker gene lacking a functional promoter, (b) selecting cells
expressing the marker gene, (c) removing serum from the culture medium, (d)
infecting the cell culture with the virus, and (e) isolating from the surviving cells a 
cellular gene within which the marker gene is inserted, 

to a mutated gene form incapable of producing a functional gene product of the gene or
to a mutated gene form producing a reduced amount of a functional gene product of the
gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the 
subject. Thus the mutated gene form can be one incapable of producing an effective 
amount of a functional protein or mRNA, or one incapable of producing a functional 
protein or mRNA, for example. The method can be performed wherein the virus is
HIV. The method can be performed in any selected cell which the virus may infect
with deleterious results. For example, the cell can be a hematopoietic cell. However,
many other virus-cell combinations will be apparent to the skilled artisan.

The present invention additionally provides a method of increasing viral infection
resistance in a subject comprising mutating ex vivo in a selected cell from the subject an
endogenous gene comprising a nucleic acid isolated by a method comprising 
(a) transferring into a cell culture growing in serum-containing medium a vector 
encoding a selective marker gene lacking a functional promoter, (b) selecting cells 
expressing the marker gene, (c) removing serum from the culture medium, (d)
infecting the cell culture with the virus, and (e) isolating from the surviving cells a
cellular gene within which the marker gene is inserted, 

to a mutated gene form incapable of producing a functional gene product of the gene or 
a gene form producing a reduced amount of a functional gene product of the gene, and 
replacing the cell in the subject, thereby reducing viral infection of cells in the subject.
The virus can be HIV, particularly when the cell is a hematopoietic cell. However, many
other virus-cell combinations will be apparent to the skilled artisan. 

The present invention provides a method of identifying a cellular gene that can 
suppress a malignant phenotype in a cell, comprising (a) transferring into a cell culture 
incapable of growing well in soft agar or Matrigel a vector encoding a selective marker
gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and
(c) isolating from selected cells which are capable of growing in soft agar or Matrigel a
cellular gene within which the marker gene is inserted, thereby identifying a gene that
can suppress a malignant phenotype in a cell. This method can be performed using any 
selected non-transformed cell line, of which many are known in the art. 

The present invention additionally provides a method of identifying a cellular 
gene that can suppress a malignant phenotype in a cell, comprising (a) transferring into
a cell culture of non-transformed cells a vector encoding a selective marker gene lacking
a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating
from selected and transformed cells a cellular gene within which the marker gene is
inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell. A
non-transformed phenotype can be determined by any of several standard methods in the
art, such as the exemplified inability to grow in soft agar, or inability to grow in
Matrigel.

The present invention further provides a method of screening for a compound 
for suppressing a malignant phenotype in a cell comprising administering the compound
to a cell containing a cellular gene functionally encoding a gene product involved in
establishment of a malignant phenotype in the cell and detecting the level of the gene 
product produced, a decrease or elimination of the gene product indicating a compound 
effective for suppressing the malignant phenotype. The level, or amount, of
gene product produced can be measured, directly or indirectly, by any of several
methods standard in the art (e.g., protein gel, antibody-based assay, detecting labeled
RNA) for assaying protein levels or amounts, and selected based upon the specific gene 
product. 
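
As a minimal sketch of the readout described above, the following Python snippet flags a compound as a candidate when the measured level of gene product in treated cells falls well below that in untreated control cells. The function name, the 50% cut-off and the measurements are hypothetical illustrations and are not values taken from this disclosure.

```python
def is_candidate(control_level, treated_level, max_fraction=0.5):
    """Return True if the treated level is at most max_fraction of control.

    control_level, treated_level: gene product amounts in arbitrary but
    identical units (e.g., band intensity from a protein gel).
    max_fraction: hypothetical cut-off; 0.5 means a decrease of 50% or more.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return treated_level / control_level <= max_fraction


# Hypothetical measurements for three compounds (arbitrary units).
control = 100.0
treated = {"compound_A": 95.0, "compound_B": 40.0, "compound_C": 0.0}

# Keep the compounds whose gene product level decreased or was eliminated.
hits = [name for name, level in treated.items() if is_candidate(control, level)]
print(hits)
```

The cut-off would in practice be chosen for the specific gene product and assay.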

The present invention further provides a method of suppressing a malignant
phenotype in a cell in a subject, comprising administering to the subject an amount of a
composition that inhibits expression or functioning of a gene product encoded by a gene
comprising the nucleic acid set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID
NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID
NO:83, or a homolog thereof, thereby suppressing a malignant phenotype. The
composition can, for example, comprise an antibody that binds a protein encoded by the
gene. The composition can, as another example, comprise an antibody that binds a
receptor for a protein encoded by the gene. The composition can comprise an antisense
RNA that binds an RNA encoded by the gene. Further, the composition can comprise a
nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the
gene.

Diagnostic or therapeutic agents of the present invention can be administered to
a subject or an animal model by any of many standard means for administering
therapeutics or diagnostics to that selected site or standard for administering that type of
functional entity. For example, an agent can be administered orally, parenterally (e.g.,
intravenously), by intramuscular injection, by intraperitoneal injection, topically,
transdermally, or the like. Agents can be administered, e.g., as a complex with cationic
liposomes, or encapsulated in anionic liposomes. Compositions can include various
amounts of the selected agent in combination with a pharmaceutically acceptable carrier
and, in addition, if desired, may include other medicinal agents, pharmaceutical agents,
carriers, adjuvants, diluents, etc. Parenteral administration, if used, is generally
characterized by injection. Injectables can be prepared in conventional forms, either as
liquid solutions or suspensions, solid forms suitable for solution or suspension in liquid
prior to injection, or as emulsions. Depending upon the mode of administration, the
agent can be optimized to avoid degradation in the subject, such as by encapsulation,
etc.

Dosages will depend upon the mode of administration, the disease or condition
to be treated, and the individual subject's condition, but will be that dosage typical for
and used in administration of antiviral or anticancer agents. Dosages will also depend
upon the composition being administered, e.g., a protein or a nucleic acid. Such
dosages are known in the art. Furthermore, the dosage can be adjusted according to
the typical dosage for the specific disease or condition to be treated. Furthermore,
viral titers in culture cells of the target cell type can be used to optimize the dosage for
the target cells in vivo, and transformation from varying dosages achieved in culture
cells of the same type as the target cell type can be monitored. Often a single dose can
be sufficient; however, the dose can be repeated if desirable. The dosage should not be
so large as to cause adverse side effects. Generally, the dosage will vary with the
age, condition, sex and extent of the disease in the patient and can be determined by
one of skill in the art. The dosage can also be adjusted by the individual physician in
the event of any complication.

For administration to a cell in a subject, the composition, once in the subject, will
of course adjust to the subject's body temperature. For ex vivo administration, the
composition can be administered by any standard methods that would maintain viability
of the cells, such as by adding it to culture medium (appropriate for the target cells) and
adding this medium directly to the cells. As is known in the art, any medium used in this
method can be aqueous and non-toxic so as not to render the cells non-viable. In
addition, it can contain standard nutrients for maintaining viability of cells, if desired.
For in vivo administration, the complex can be added to, for example, a blood sample or
a tissue sample from the patient, or to a pharmaceutically acceptable carrier, e.g., saline
and buffered saline, and administered by any of several means known in the art.
Examples of administration include parenteral administration, e.g., by intravenous
injection including regional perfusion through a blood vessel supplying the tissue(s) or
organ(s) having the target cell(s), or by inhalation of an aerosol, subcutaneous or
intramuscular injection, topical administration such as to skin wounds and lesions, direct
transfection into, e.g., bone marrow cells prepared for transplantation and subsequent
transplantation into the subject, and direct transfection into an organ that is subsequently
transplanted into the subject. Further administration methods include oral
administration, particularly when the composition is encapsulated, or rectal
administration, particularly when the composition is in suppository form. A
pharmaceutically acceptable carrier includes any material that is not biologically or
otherwise undesirable, i.e., the material may be administered to an individual along with
the selected complex without causing any undesirable biological effects or interacting in
a deleterious manner with any of the other components of the pharmaceutical
composition in which it is contained.

Specifically, if a particular cell type in vivo is to be targeted, for example, by
regional perfusion of an organ or tumor, cells from the target tissue can be biopsied and
optimal dosages for import of the complex into that tissue can be determined in vitro, as
described herein and as known in the art, to optimize the in vivo dosage, including
concentration and time length. Alternatively, culture cells of the same cell type can also
be used to optimize the dosage for the target cells in vivo.

For either ex vivo or in vivo use, the complex can be administered at any
effective concentration. An effective concentration is that amount that results in
reduction, inhibition or prevention of the viral infection or in reduction or inhibition of
transformed phenotype of the cells.

A nucleic acid can be administered in any of several means, which can be
selected according to the vector utilized, the organ or tissue, if any, to be targeted, and
the characteristics of the subject. The nucleic acids, if desired in a pharmaceutically
acceptable carrier such as physiological saline, can be administered systemically, such as
intravenously, intraarterially, orally, parenterally, or subcutaneously. The nucleic acids can
also be administered by direct injection into an organ or by injection into the blood
vessel supplying a target tissue. For an infection of cells of the lungs or trachea, it can
be administered intratracheally. The nucleic acids can additionally be administered
topically, transdermally, etc.

The nucleic acid or protein can be administered in a composition. For example,
the composition can comprise other medicinal agents, pharmaceutical agents, carriers,
adjuvants, diluents, etc. Furthermore, the composition can comprise, in addition to the
vector, lipids such as liposomes, such as cationic liposomes (e.g., DOTMA, DOPE,
DC-cholesterol) or anionic liposomes. Liposomes can further comprise proteins to
facilitate targeting a particular cell, if desired. A composition
comprising a vector and a cationic liposome can be administered to the blood afferent to
a target organ or inhaled into the respiratory tract to target cells of the respiratory tract.
Regarding liposomes, see, e.g., Brigham et al. Am. J. Resp. Cell Mol. Biol. 1:95-100
(1989); Felgner et al. Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987); U.S. Pat.
No. 4,897,355.

For a viral vector comprising a nucleic acid, the composition can comprise a
pharmaceutically acceptable carrier such as phosphate buffered saline or saline. The
viral vector can be selected according to the target cell, as known in the art. For
example, adenoviral vectors, in particular replication-deficient adenoviral vectors, can be
utilized to target any of a number of cells, because of their broad host range. Many other
viral vectors are available, and their target cells known.

EXAMPLES 

Selective elimination of virally infected cells from a cell culture

Rat intestinal cell line-1 cells (RIE-1 cells) were standardly grown in Dulbecco's
modified Eagle's medium, high glucose, supplemented with 10% fetal bovine serum. To
begin the experiment, cells persistently infected with reovirus were grown to near
confluence, then serum was removed from the growth medium by removing the
medium, washing the cells in PBS, and returning to the flask medium not supplemented
with serum. Typically, the serum content was reduced to 1% or less. The cells are
starved for serum for several days, or as long as about a month, to bring them to
quiescence or growth arrest. Media containing 10% serum is then added to the
quiescent cells to stimulate growth of the cells. Surviving cells are found not to be
persistently infected cells by immunohistochemical techniques used to establish whether
cells contain any infectious virus (sensitivity to 1 infectious virus per ml of homogenized
cells).

Cellular Genomic DNA Isolation 

Gene Trap Libraries: The libraries are generated by infecting the RIE-1 cells
with a retrovirus vector (U3 gene-trap) at a ratio of less than one retrovirus for every
ten cells. When a U3 gene trap retrovirus integrates within an actively transcribed gene,
the neomycin resistance gene that the U3 gene trap retrovirus encodes is also
transcribed; this confers on the cell resistance to the antibiotic neomycin. Cells with gene
trap events are able to survive exposure to neomycin while cells without a gene trap
event die. The various cells that survive neomycin selection are then propagated as a
library of gene trap events. Such libraries can be generated with any retrovirus vector
that has the properties of expressing a reporter gene from a transcriptionally active
cellular promoter that tags the gene for later identification.
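
The low ratio of retrovirus to cells favors clones that carry a single gene trap event. The following sketch estimates that share under the assumption that the number of integrations per cell follows a Poisson distribution whose mean equals the ratio of virus to cells. This distributional assumption and the function names are illustrative only and are not part of this disclosure.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def single_hit_fraction(ratio):
    """Among cells with at least one integration, the fraction with exactly one."""
    p_none = poisson_pmf(0, ratio)
    p_one = poisson_pmf(1, ratio)
    return p_one / (1.0 - p_none)


ratio = 0.1  # one retrovirus for every ten cells
print(f"cells without an integration: {poisson_pmf(0, ratio):.3f}")
print(f"single integration among infected cells: {single_hit_fraction(ratio):.3f}")
```

Under this assumption, at a ratio of 0.1 roughly 95% of the infected cells carry a single integration, and lower ratios raise that share further.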

Reovirus selection: Reovirus infection is typically lethal to RIE-1 cells but can
result in the development of persistently infected cells. These cells continue to grow
while producing infective reovirus particles. For the identification of gene trap events
that confer reovirus resistance to cells, the persistently infected cells must be eliminated
or they will be scored as false positives. We have found that RIE-1 cells persistently
infected with reovirus are very poorly tolerant to serum starvation, passaging and plating
at low density. Thus, we have developed protocols for the screening of the RIE-1 gene
trap libraries that select against both reovirus sensitive cells and cells that are persistently
infected with reovirus.

1. RIE-1 library cells are grown to near confluence and then the serum is removed
from the media. The cells are starved for serum for several days to bring them to
quiescence or growth arrest.

2. The library cells are infected with reovirus at a titer of greater than ten reovirus
per cell and the serum starvation is continued for several more days.

3. The infected cells are passaged (a process in which they are exposed to serum
for three to six hours) and then starved for serum for several more days.

4. The surviving cells are then allowed to grow in the presence of serum until
visible colonies develop, at which point they are cloned by limiting dilution.
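
Step 2 uses more than ten reovirus per cell so that very few library cells escape infection. As a rough illustration, assuming that the number of virus particles reaching each cell is Poisson distributed (an assumption made here for the sketch, not a statement of this disclosure), the expected number of cells that receive no virus can be estimated as follows. The cell count is a hypothetical figure.

```python
import math


def fraction_unexposed(virus_per_cell):
    """Poisson probability that a cell receives no virus particle."""
    return math.exp(-virus_per_cell)


def expected_unexposed(cell_count, virus_per_cell):
    """Expected number of cells in the culture that receive no virus."""
    return cell_count * fraction_unexposed(virus_per_cell)


cells = 1_000_000  # hypothetical number of library cells in the flask
for virus_per_cell in (1, 5, 10):
    survivors = expected_unexposed(cells, virus_per_cell)
    print(f"{virus_per_cell:>2} virus per cell: about {survivors:.0f} unexposed cells")
```

The sharp drop between one and ten virus per cell suggests why a high titer limits the background of cells that survive only because they were never exposed.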

MEDIA: DULBECCO'S MODIFIED EAGLE'S MEDIUM, HIGH GLUCOSE
(DME/HIGH) Hyclone Laboratories cat. no. SH30003.02.

NEOMYCIN: The antibiotic used to select against the cells that did not have a U3 gene
trap retrovirus. We used GENETICIN, from Sigma, cat. no. G9516.

RAT INTESTINAL CELL LINE-1 CELLS (RIE-1 CELLS): These cells are from the
laboratory of Dr. Ray Dubois (VAMC). They are typically cultured in Dulbecco's
Modified Eagle's Medium supplemented with 10% fetal calf serum.

REOVIRUS: Laboratory strains of either serotype 1 or serotype 3 are used. They were
originally obtained from the laboratory of Bernard N. Fields (deceased). These viruses
have been described in detail.

RETROVIRUS: The U3 gene trap retroviruses used here were developed by Dr. Earl
Ruley (VAMC) and the libraries were produced using a general protocol suggested by
him.

SERUM: FETAL BOVINE SERUM Hyclone Laboratories cat. no. A-1115-L.

Characteristics of some of the isolated sequences include the following:

SEQ ID NO:1- rat genomic sequence of vacuolar H+ATPase (chemically inhibiting the
activity of the gene product results in resistance to influenza virus and reovirus)

SEQ ID NO:2- rat alpha tropomyosin genomic sequence

SEQ ID NO:3- rat genomic sequence of murine and rat gas5 gene (cell cycle regulated
gene)

SEQ ID NO:4- rat genomic sequence of p162 of ras complex, mouse, human (cell
cycle regulated gene)

SEQ ID NO:5- similar to N-acetyl-glucosaminyltransferase I mRNA, mouse, human
(enzyme located in the Golgi region in the cell; has been found as part of a DNA
containing virus)

SEQ ID NO:6- similar to calcyclin, mouse, human, reverse complement (cell cycle
regulated gene)

SEQ ID NO:7- contains sequence similar to: LOCUS AA254809 364 bp mRNA EST
DEFINITION mz75a10.r1 Soares mouse lymph node NbMLN Mus musculus cDNA
clone 719226 5'

SEQ ID NO:8- contains a sequence similar to SW:RSP1_MOUSE Q01730 RSP-1
PROTEIN

SEQ ID NO:9- contains 5' UTR of gb|U25435|HSU25435 Human transcriptional
repressor (CTCF) mRNA, complete cds, Length = 3780

SEQ ID NO:38- similar to cDNA of retroviral origin

SEQ ID NO:50- trapped AYU-6 genetic element

Isolation of cellular genes that suppress a malignant phenotype

We have utilized a gene-trap method of selecting cell lines that have a
transformed phenotype (are potentially tumor cells) from a population of cells (RIE-1
parentals) that are not transformed. The parental cell line, RIE-1 cells, does not have
the capacity to grow in soft agar or to produce tumors in mice. Following gene-trapping,
cells were screened for their capacity to grow in soft agar. These cells were
cloned and genomic sequences were obtained 5' or 3' of the retrovirus vector (SEQ ID
NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID
NO:81, SEQ ID NO:82, SEQ ID NO:83). All of the cell lines behave as if they are
tumor cell lines, as they also induce tumors in mice.

Of the cell lines, two are associated with the enhanced expression of the
prostaglandin synthetase gene II or COX 2. The COX 2 gene has been found to be
increased in pre-malignant adenomas in humans and overexpressed in human colon
cancer. Inhibitors of COX 2 expression also arrest the growth of the tumor. One of
the cell lines, x18 (SEQ ID NO:76), has disrupted a gene that is now represented in the
EST (dbest) database, but the gene is not known (not present in GenBank).
(SEQ ID NO:76): >02-X18H-t7... identical to: gb|W55397|W55397 mb13h04.r1 Life
Tech mouse brain Mus at 1.0e-114. x18 has also been sequenced from the vector with
the same EST being found. (SEQ ID NO:77): >x8_b4_2. (SEQ ID NO:78):
>x7_b4.. (SEQ ID NO:79): >x4-b4.. (SEQ ID NO:80): >x2-b4... (SEQ ID NO:81):
>x15-b4 (SEQ ID NO:82): >x13-re.., reverse complement. (SEQ ID NO:83):
>x12_b4..
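
Some of the sequences are reported as the reverse complement of the matching database entry (for example SEQ ID NO:82). A minimal sketch of that operation is given here for orientation; it leaves the ambiguity code N unchanged. The function name and the test string are illustrative and are not part of the sequence listing.

```python
# Translation table for the four bases plus the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    bases = "".join(seq.split())  # drop the spaces between groups of ten
    return bases.translate(_COMPLEMENT)[::-1]


print(reverse_complement("GGNGGTCACC"))  # prints GGTGACCNCC
```

Applying the function twice returns the original string, which is a convenient check when comparing a trapped sequence against a database hit in either orientation.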


Each of the genes from which the provided nucleotide sequences are isolated
represents a tumor suppressor gene. The mechanism by which the disrupted genes other
than the gene comprising the nucleic acid whose sequence is set forth in SEQ ID NO:76
may suppress a transformed phenotype is at present unknown. However, each one
represents a tumor suppressor gene that is potentially unique, as none of the genomic
sequences correspond to a known gene. The capacity to quickly select tumor
suppressor genes may provide unique targets in the process of treating or preventing
(potential for diagnostic testing) cancer.



Isolation of entire genomic genes

An isolated nucleic acid of this invention (whose sequence is set forth in any of
SEQ ID NO:1 through SEQ ID NO:83), or a smaller fragment thereof, is labeled by a
detectable label and utilized as a probe to screen a rat genomic library (lambda phage or
yeast artificial chromosome vector library) under high stringency conditions, i.e., high
salt and high temperatures to create hybridization and wash temperature 5-20°C.
Clones are isolated and sequenced by standard Sanger dideoxynucleotide sequencing
methods. Once the entire sequence of the new clone is determined, it is aligned with the
probe sequence and its orientation relative to the probe sequence determined. A second
and a third probe are designed using sequences from either end of the combined genomic
sequence, respectively. These probes are used to screen the library and isolate new clones,
which are sequenced. These sequences are aligned with the previously obtained
sequences and new probes designed corresponding to sequences at either end and the
entire process repeated until the entire gene is isolated and mapped. When one end of
the sequence cannot isolate any new clone, a new library can be screened. The complete
sequence includes regulatory regions at the 5' end and a polyadenylation signal at the 3'
end.
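
The repeated cycle of aligning new clones with the sequence already in hand can be sketched as a simple assembly loop. The example below extends a growing sequence at its 3' end whenever a clone starts with an exact copy of the current end; extension at the 5' end would be analogous. The exact-match rule, the minimum overlap and all names are simplifying assumptions for illustration; actual alignments tolerate mismatches and ambiguous bases.

```python
def extend_right(contig, clone, min_overlap=20):
    """Append clone to contig if a suffix of contig equals a prefix of clone.

    The longest exact overlap of at least min_overlap bases is used.
    The contig is returned unchanged when no such overlap exists.
    """
    longest = min(len(contig), len(clone))
    for size in range(longest, min_overlap - 1, -1):
        if contig.endswith(clone[:size]):
            return contig + clone[size:]
    return contig


def walk(probe, clones, min_overlap=20):
    """Grow the sequence from the probe until no remaining clone extends it."""
    contig = probe
    remaining = list(clones)
    progress = True
    while progress and remaining:
        progress = False
        for clone in list(remaining):
            merged = extend_right(contig, clone, min_overlap)
            if len(merged) > len(contig):
                contig = merged
                remaining.remove(clone)  # each clone is used at most once
                progress = True
    return contig


# Toy example with short made-up sequences and a small overlap.
print(walk("ATGCCGTA", ["CTAATTGC", "CGTAGGCTAA"], min_overlap=4))
```

When the loop stops with clones left over, no further extension was possible from that end, which corresponds to the point at which a new library would be screened.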

Isolation of cDNAs 

An isolated nucleic acid (whose sequence is set forth in any of SEQ ID NO:1
through SEQ ID NO:83, and preferably any of SEQ ID NO:5 through SEQ ID NO:83),
or a smaller fragment thereof, or additional fragments obtained from the genomic
library, that contain open reading frames, is labeled by a detectable label and utilized as
a probe to screen a cDNA library. A rat
cDNA library obtains rat cDNA; a human cDNA library obtains a human cDNA.
Repeated screens can be utilized as described above to obtain the complete coding
sequence of the gene from several clones if necessary. The isolates can then be
sequenced to determine the nucleotide sequence by standard means such as
dideoxynucleotide sequencing methods.

Serum survival factor isolation and characterization 

The lack of tolerance to serum starvation is due to the acquired dependence of
the persistently infected cells on a serum factor (survival factor) that is present in serum.
The serum survival factor for persistently infected cells has a molecular weight between
50 and 100 kD and resists inactivation in low pH (pH 2) and chloroform extraction. It is
inactivated by boiling for 5 minutes [once fractionated from whole serum (50 to 100 kD
fraction)], and in low ionic strength solution [10 to 50 mM].



The factor was isolated from serum by size fractionation using Centriprep molecular
cut-off filters with excluding sizes of 30 and 100 kD (Millipore and Amicon), and
dialysis tubing with a molecular exclusion of 50 kD. Polyacrylamide gel electrophoresis
and silver staining were used to determine that all of the resulting material was between
50 and 100 kD, confirming the validity of the initial isolation. Further purification was
performed using ion exchange chromatography and heparin sulfate adsorption
columns, followed by HPLC. Activity was determined following adjusting the pH of the
serum fraction (30 to 100 kD fraction) to different pH conditions using HCl and
readjusting the pH to pH 7.4 prior to assessment of biologic activity. Low ionic
strength sensitivity was determined by dialyzing the fraction containing activity into low
ionic strength solution for various lengths of time and readjusting ionic strength to
physiologic conditions prior to determining biologic activity by dialyzing the fraction
against the media. The biologic activity was maintained in the aqueous solution
following chloroform extraction, indicating the factor is not a lipid. The biologic activity
was lost after the 30 to 100 kD fraction was placed in a 100°C water bath for 5 minutes.

Isolated nucleic acids 

Tagged genomic DNAs isolated were sequenced by standard methods using
Sanger dideoxynucleotide sequencing. The nucleotide sequences of these nucleic acids
are set forth herein as SEQ ID NO:1 through SEQ ID NO:75 (viral infection genes) and
SEQ ID NO:76 through SEQ ID NO:83 (tumor suppressor genes). The sequences were
run through computer databanks in a homology search. Sequences for some of the
"6b" sequences [obtained from genomic library 6, flask b] (i.e., SEQ ID NO:37, 38, 39,
42, 61, 65, 66, 69) correspond to a known gene, alpha tropomyosin, and some of the
others correspond to the vacuolar H+-ATPase. These sequences are associated with
both acute and persistent viral infection, and the cellular genes which comprise them, i.e.,
alpha tropomyosin and vacuolar H+-ATPase, can be targets for drug treatments for
viral infection using the methods described above. These genes can be therapy targets
particularly because disruption of one or both alleles results in a viable cell.
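
Before a sequence is submitted to a homology search, the rows of the listing have to be joined into one string. The following sketch does this for rows written as groups of ten bases followed by a running position count, and reports the length and the share of undetermined bases (N). The function names are hypothetical; the two sample rows are the final rows of SEQ ID NO:1.

```python
import re


def parse_listing_rows(rows):
    """Join sequence-listing rows into a single string of bases.

    All whitespace and digits are removed, so the spacing between groups
    and the position count at the end of each row are dropped.
    """
    return "".join(re.sub(r"[\s\d]", "", row) for row in rows).upper()


def undetermined_fraction(seq):
    """Fraction of positions reported as N."""
    return seq.count("N") / len(seq) if seq else 0.0


rows = [
    "CCCGTCGTCC TAGCCCGCCG CCGCCTGCTG AGCTTGCCCT CTTCCCCGCT TGCAGACATG 780",
    "GNGGACATTG AAAGACCCTA CCTNAAGGGC CNGCANGCNA GAAAAAGT 828",
]
seq = parse_listing_rows(rows)
print(len(seq), seq.count("N"), round(undetermined_fraction(seq), 3))
```

Comparing the joined length with the stated LENGTH field is a simple consistency check for each entry of the listing.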

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: VANDERBILT UNIVERSITY 

305 Kirkland Hall 
Nashville, TN 37240 

(ii) TITLE OF INVENTION: MAMMALIAN GENES INVOLVED IN VIRAL 
INFECTION 

(iii) NUMBER OF SEQUENCES: 83 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Needle & Rosenberg, P.C. 

(B) STREET: 127 Peachtree Street, Suite 1200 

(C) CITY: Atlanta 

(D) STATE: Georgia 

(E) COUNTRY: USA 

(F) ZIP: 30303-1811 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Selby, Elizabeth 

(B) REGISTRATION NUMBER: 38,298 

(C) REFERENCE/DOCKET NUMBER: 22000.0061/P

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 404 688 0770 

(B) TELEFAX: 404 688 9880 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 828 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAAAAAAAAT TACCATTTTT GGGNGAACCT TTNATANTTN GTTCCTAGAG GGNGAGTCAG 60

GGGTAAAAAA AACGATNAAG GGAGTTGNGG CGATTGGAGA AGCTATTATG AAGGGATAAA 120 

ANACTTAGGT TGAGCCGGCG GGTGGGGTGT ATTCTTGGGG TGGNGAAAAG NNAGATCAAC 180 

ATGAGATTTT TTTGTTTTAG GTTTTGCATG TTGTAATGCA ATANTTTAAC CTGATTTTAT 240 
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GTGCAGGATG CCTGAGGTTT GTGAGCAGGA ACACAGGAAA AGGAACACCG GTANTCGAAC 300 

ACCGGTGAGT CCGCGCAGCC GCAGAGAAGG CGGGT AT CAT TCGNTCCACC CTGTATGNTA 360 

ATATGGAGCG CTACGGCCCC GCCCCTGGGG CCGAT GGGCC CAAAAAGGTA GGGTTCGAGA 420 

AGACGT CTGC AT GGAGCAGT GGACCAGTGA AGACCCAGGC AAGGCCGAAC GTTGGGCCCC 480 

GGGCCCCGGG GGCGGGTAGC AGGGC CCATA CATTGTCCAA GGGCTGCTGG AGAGCCTGGA 540 

GCCTCGCTCC CCCACCGGCG CAAAGT GGTA CAGCCCATGG GGGCGTGGCC CAT AT CATGG 600 

ACGCGAGCGC GGCCGCCATC TTGNTCTGCG GTGCTGGTAT TTAGAGCGCA GCGCCTGACT 660 

GGCGGGGTCG CCTTCGCATC CGCCGCTTCG AGAATCTTCT TTCGTCTGCT CGCTCTCTCT 720 

CCCGTCGTCC TAGCCCGCCG CCGCCTGCTG AGCTTGCCCT CTTCCCCGCT TGCAGACATG 780

GNGGACATTG AAAGACCCTA CCTNAAGGGC CNGCANGCNA GAAAAAGT 828 
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 845 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TCNCCTAAGA NANGAGANAG GTTAGATGGN AATGGAGANT ANATACCGGG CTTAGCTTCG 60 

CCNNGGACCC ACCNAGGGGA AAAGAGCCNT CNNGCAACAA ACNAAAGGAN CGGAAAGAGG 120 

AAGGGNANGN GGNNAAACAN ATTGGGCGAA TTTAAAANCT NNGNCCNGTT TGAAATAGNG 180 

CNCGGCCGNT CCNTGGGCCN GATCCANCCT TCCNTNACTT TTCNTCCCCN GCNTTAAATT 240 

GCGNCGNCGG CCCCCCCAAC CATNTNTTCC GTTTTNANCA CCNGNGGCCC CGGCAGTGCN 300 

GATGNNGGGG AATTGNNAAT GCCCCCCANC CATTTTGNNT CNGNNCCTGG GGAGAGANTN 360 

AAACGGTGNG NGNAGNNGTT AATATGGCGG CAGCGGNGAC ANCAGTAGCC AGNGCAGGCA 420 

CGCGNAGTTG GCNGGGGACG CCANGTGNCN GGAGANNTGG AGCGGCGGCG GAGCGGGCNC 480 

CNAAAAAAAA AAANAANNGN TGGTAAGGGG GCCCGGGGTG GANGANATTT CNNGGGCNGC 540 

TTCTAGGNGT CANGNTGNGG CCGCTNCGTT CGGCCCTGGA TGNAGCCCNG NGCCNGTGCC 600 

NCCNCCGGGG GGAGTTTGTT TCCNTCTACC GTNCCCTGCT GNGGAGCGAC GANCTGCANT 660

CCCCNGGAGC GTCTANNAGG CCGTGGCNAA CCCCATCNAN GCNCNCCAGT NAGCTTCCTT 720 

CNTCCCGACA TAGTAGGCGT CNGGNGGCGT TGNCGACAGN GGCCNNCGTC GATGGGANNN 780

TCTATTTNNG NTTCATGGGC CGTATGTTAG ACCTNTCGAA GGACGCGNNA AATAGATAGG 840

GGGGG 845
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 818 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TACACCTTTG NGNGTGTTGA AAATTACGGG GGANANGAAN AAAAANGTAT CCTTTTGGAN 60

GCCCCGGNCT CTTGTGGAAT TTGTGATTTA CGGCGGNANT CATATGATTT CGGAAANAAG 120

ATAAAGCCNN NCNNNNNGGG GTAGGGAAGA AGGATTTTGN AAACAAANTN TGGGTNTATA 180

TAANNGTGGG GGGGGGAGNT CATTGAGGNG GGGNGGAATA TNNAATNTTT TTTTTTTNNT 240

TNNNNGGCAA GAGGGATGAA GGTAAGGTTA GTATGAAATG GCCNNNCCAG AGAAGTTNGA 300

TGAAAAAGAT AGTGCCACCA AGAGANATNA TTTGTTATTT TTAACAGTGG GGGGAGGTAG 360

TTNTAGACCA CCATTTATTA NAACTGAGGC ACAAAGAAGA TGATTGGGGG GCACTTACAG 420

AGTAAGCAGT ATTTACATAA AGATTTNTTC CCCAGGAATN ANGAGGAAGN TGGATAACTG 480

AACAAAGCCA TGTAAGCAGG CTTTTTGGTA TGCATGTGGT CCCATTACAA GGAATACCCA 540

ATAAATAGCA AATGCACACT GCCATTCACA AGCAATTGCA GAGAATGGGT GGGGGATGTG 600

AAACTAAAGA GCTTTGTAGC TGCCTGAGGA GGTGGGTTCT CTATATCCGT GGGAGCTAGT 660

GATCCCCCAC AGGTCTTAGC TGGTGCCATG ATTGTGATCT TAGGCCAGAT TTGATGTCCC 720

CCACATGGCC GAGTCCGCCA TGGATGCAAC AGGGCAGCTT TATTTGCTGT GGGCNGGTAN 780 

TGAAGGATNT CACAAATGAA CTTGGCAAGT AGAGAGGT 818
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 857 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TGGAAAGANT GNGNTAAAGT TNAGTTNNNA GATATTGANN AANNTNGGGN AAAANAAGGT 60 

GNNNNACAAT CTCNCAANNA TTTNAANGAA GGGGGAATAA ATGNAAANTG GGANTTAAAA 120

[illegible] [illegible] NGGTTNAANA NAAGGGGGGT NTNCCCGTTT TTTTTTTAGG 180

ATCCTGGGAG [illegible] [illegible] TTNGNANAAG GGNGNTCCTT CCCTTCCNGT 240

CAGTAAGGGA TGGGGCCCTA [illegible] [illegible] TGACAGGAMA CCGGTCAGNA 300

TTCCGTTAAG TATTTTGACC [illegible] [illegible] [illegible] NGACCTTAAA 360

CGCGNCCAGA TTNTGCGAAN [illegible] [illegible] [illegible] [illegible] 420

TCGCAGATNT GACCGCAGAM [illegible] [illegible] CCGNTGGAGC [illegible] 480

GGAGAAAATT [illegible] [illegible] ACCCAAAGAA CACAACTGTT CTCGCTGCCC 540

GGCACCCATC GCCACGTCAG CTCACGCTCG CGACGCCAGC ACGCNTGCGC GCAGAGAAAG 600

GCGGAGCATG CGCAAAGGCC TGCNTNTAAC ATCCGGGGCT CGGGCGGCGG CGCTGCCGCC 660

GCGAGGGATT AANGGGGTCT TTCNTTTCNG TCTCTGGCCG GCTGGGCGCG GGCGACTGCT 720

GGCGAGGCGC GTGGAAGCTC GCGATAGTTC CCCTCCGCCT CCTCTTCCCG GTCCAGGCCA 780

CTAGGGAGTT CGCTGACGCC GGGTGAACTG AGCGTACCGC CTGAAAGACC CCACAAGTAG 840

GTTTGGCAAG TAGAAAG 857



(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 896 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGGAGAAAGG GGCGACNTTT ATTGGTCCNG GAGNGGGGGG NCAAATGGGT TTTTATCCAN 60

TTTAACGGGG GGAGGCCCCG GNNGAGGAAT TCCCGGGGGA GGAANAAAAA CAAGATCCGC 120 

NTAAGAGGGN GGGGGTNTCC GNNNTTNTTN GAATNGTGGN GCACCGGGGG GGCAAGGAAG 180 

AGGGTTCCCG GAGAATGGGG NGGATAAAAN GATTGGCAAC TCACCCCGGN TAGTTGTACC 240

AGGTGTTTTT TTTTTTTTTT TTTGTTCANA AANAGGAAAA TGATTCAAGT TAAAAAAGTA 300

ATTGGCAAGG AAATTTTTTT CCTANCCTCC TTGAAAAATA GTGGGAACAG GGGTTCCCAA 360 

GGGGAAAGGT CCCCNATTNA ACAAAATGNG TTTCAGNGGA GTGTGGCCCA CCCATTGTGT 420 

NTCCATGGAA GAGTGGCTTT TNTGGNGAAG TTCATTTTCC TTAACCTTNA NNACTGTAAN 480

GGNTCTTGTG CTTGAGAATA TTGTTGGCCA GCTTTATNGT CTTCATTTNT AANACTATTT 540 

AGACTAGAGT GTTNTAGATT NTAGGTCTTC ANGTTTCCAG TCACCAGTCC TTGGCTTTTT 600 

AGTATGGAAA TCACCAGTAA TGGCAATATA ACATCCCTGC TTCTGTTTCT TAGAAGGCTN 660

NATTACAGTG TGTTCAAACT CCGTGTCATT GCAACAGGTT AAACTAACTT TNTACGTAGG 720

ACATCAGGGT ATTGACATTC TCATCCTAAA GTCAGTTTGT CTGTTTCGAG AGGAGGAACT 780

GAAGCAGTGG TTCTTTAAGT AACTGACTCA GGGCTTTCCT GCCTGGCGCG CCTGCCAGGC 840

ATNGTGTAGC ATTGTACTGC ATCTTCTTTG ACCAGTTTCC CCAGGTGAAG AGCCTG 896

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 937 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGGCCCCCCC CCCCCNANTT AATTTTNGGG AAGAAAAAAG GGAAAAAANT TTGGGGTCAG 60 

GAAAAANGAA GTTGGNAANC GNNGGGGNGN CAGNATTNGA ANAGTGGGGG ANNTTAATTT 120 

NAGAGGTCCC TTNNTTCCNN GGAAAAGTTT AAAAGGGGTT CAATTAACTT NGGATCNCCA 180

TTTATCAGAT TACCCGNGNG TCACCTGGGG ACCCTTTACN GGTGGCGGGA CATTNGAAAN 240 

ACATATTAGT CAGATTATAC ATAGCAAANA TAGTTAGGAG CACAANGAAT CATTTATGGT 300

GGNGGTCACC ACACAGGAGA TGTATTATCC GCAGTATTAG AGAGTTGAGA ACCATATNTT 360

AGAGATGCGG TAGACTGACT GTTCCCTTTT CGNTTGGAGT GACCTTGCCA TTAGAGGCAA 420

CAGCATCAGT ATTGTTCCCA GTCCCCNTCA CACTGATTCG AACTTTAAGG ACACTGATCT 480

NTGGCTGGTA GAGGTTCAGC ACACATACCA GAGTTACGAG TCACGTGCCA GAAGGGCAAA 540

CTGAACACGG AATTAGAGGG AACTCGATGT CTCCGGCTTG CACTGGTCTT CTCTTGCANT 600 

AGAATCCTTC ATCCTGCTCC CAGTCCGGAC GTCCAGGCAA CAAGGGCGTG GAAAGTGAGG 660 

GGGCTGGGAG GTGTGTTTGC CTTGCCTCAG GCGNTGGGTG GGGTTGGGGC GTGCCAGCAC 720

TCCCCTGGGC GGGCNTCACC GATGCTGGCC ACTATAAGGC CAGCCAGACT GCGACACAGT 780

CCATCCCCTC GACCACTCTT TTGGCGCTTC ATTGTCGACG TGTGGTGAGC TCTCACTGGG 840

GCGTCCCTCT AAGATCTGTC CACTNCCTGG TCTAGGGGTT AAGCNTTTTC CTGCCCTGAA 900

AGACCCCACA ATGTAGNTTT GGCAAGCTAG CAAAGGT 937
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 888 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

AAAAGGGGGC CCCAGCGGNG GGGGGTTGTC CAAGGAATCA AAANGTGGGG NGGGGGGGAA 60 

AAAANTACTT TTAAAAAAGG CNGCCNNANA ATANANGACG TTCNGGGGNG TTTGAAAAAA 120 

GGCCGGAAGC CTCGGACNGG TTTCNNTGTT AGGACAAGGA AAAAGGGNAC GCACNGGGAT 180 

TTCCTTTCCT TATNTTAGCA AATNGCCGGC CAGGAAACCA NCGAGTTGGG NGGGNTTNGG 240 

TTTTCNGTNA AAGGAAAGCA GGGGGGGGAN AAACACGGAN AAAAAGGGAA GAANNGGGTT 300 

NATTNNGGTT AGNAATTGGN TCCCAGAGAG NGCCAAGAAA ATNGGCCTGT CCAAAATTCT 360

TTTTCCCNGC TTTTAAGACA GGCANGATAN TATNNGGCAG CAGGTNATTA CCANAGGTAA 420

GTAAATTACA ATGGGTAAGG GCTTGGCACA GGCCAGGGTA AGTAGGGCAN GTATGGATGT 480

TAAACATTAC CCTTCATCCN GAGGNAGTTA ACACAAGCAT TCNTGGCGGG TCTCACATAT 540

CCCAAANAAA AATNTTCAAA AGNAGCCCCN TGGGGAACGT TAAGCCAAGC NTANGACTCA 600 

CAAGGGANGA CATGGGCAGG NTAGGGNACA GAATCAGTGN TCAGAGACTC CAGGGGCACC 660 

CCTGATTCCN TTTGNTGTCA CACAGACANT GCTCCAGGGA CAACCTTCCC GGANGTGAGT 720 

ATANGACTTT CCTGATGGNG ACGCTGCCGT GANGGGACAC TNCCTCGTGG TAGCACACAT 780

TCCTCAGTCA GCTTCTGAGC CTCAGGGTCC CAGCAGGCAC AGTGGCAANG ACCTCATTCT 840

TCTCGTCTGT CCCACTGAAA GACNNTCACN AAGGAGCTGG CTAGTAGA 888 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 980 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

AGAAATGAAA AAGAAGGAAA GCTAAAAATA GATTATAAGT GTTCTATTTG AAAAAAGAAA 60

GAAAAAAAAG AAAAAGAACA CAGAGAAGAA TAAAGGAGAA GAAAAAGGAA GAGAAAAAAA 120 

AGAAAGAAAA AACGGAAAAG AAACCTAGAA AATAAAAAAA CAAAGTATCC GATAAGGAAG 180

AGAAAGGAGA AAGACTTACC TAGAGCCCAG AAATAGAGAA ACTAGAACAA AAAATGGAGA 240

AGAAGAGGAG AGAAAAAGGA TTAGAGAGGG TGAGGTAGAA GGAAGAAAAG ACAAGAAAGC 300 

AGAAAAAAAC TAACAAAGAT GCATATAAAC AGAGAGAAGA TGATTAAGAT TAGAGAAAAA 360

GACCAAAGAG AGAAGGTAGA CAGGACAAAT AAAACAAAAA CAGGAGGGGA GAAGGGGAAA 420

GAAGAAAGAG GGCAAAAGCA AAGGAATAAG ATAATAGCAC CAATAGCAGG ACAGTAAAGG 480 

GTAGAGAAGG GACCATTCCC TACCCCATAG GGGGGAACGA CCCCGGAATC AAAATACAAG 540 

GCACCGAGCT GAACCTGGTT ATCACACAGG CAGGAGTGGT ATAGCACGGC GTTCCGGGCA 600

AAAAAAAAAA TGAAAAATAA ATTCCTTCGG GCGGAGAACT AGAAGAGGAT GGGAACTCCT 660 

TGACAGAAGT AGCAGGCAGG AAGCCAGCCA GCACCCCAGC CCAAACAGAA GCAGCCGCAA 720

TGAAACGGGC GGCAGATCCA CATCCGCAAA GTCCTCAAGG GAGCATCGGC GAGGCCCGGA 780 

GCCAATGAGG AAGGGCAGGA AACCATATCA AGCCGAGCGT CGGGACGGCT GCCATGAGAC 840

ACCCGGAGAG GTAATTTTTT TTTTACGGGA AGCGTCCAGC CAAGTTAGTG GGCCGGAAGC 900

GACGGTACTT TAGTATACAT CGTTTTGCCC GAGTGGTCAG ATTCTTTTGT TATCCCCAAC 960 

AGAACCGTAA GCTAGAAATA 980
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 845 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TCNCCTAAGA NANGAGANAG GTTAGATGGN AATGGAGANT ANATACCGGG CTTAGCTTCG 60 

CCNNGGACCC ACCNAGGGGA AAAGAGCCNT CNNGCAACAA ACNAAAGGAN CGGAAAGAGG 120

AAGGGNANGN GGNNAAACAN ATTGGGCGAA TTTAAAANCT NNGNCCNGTT TGAAATAGNG 180

CNCGGCCGNT CCNTGGGCCN GATCCANCCT TCCNTNACTT TTCNTCCCCN GCNTTAAATT 240

GCGNCGNCGG CCCCCCCAAC CATNTNTTCC GTTTTNANCA CCNGNGGCCC CGGCAGTGCN 300 

GATGNNGGGG AATTGNNAAT GCCCCCCANC CATTTTGNNT CNGNNCCTGG GGAGAGANTN 360

AAACGGTGNG NGNAGNNGTT AATATGGCGG CAGCGGNGAC ANCAGTAGCC AGNGCAGGCA 420 

CGCGNAGTTG GCNGGGGACG CCANGTGNCN GGAGANNTGG AGCGGCGGCG GAGCGGGCNC 480 

CNAAAAAAAA AAANAANNGN TGGTAAGGGG GCCCGGGGTG GANGANATTT CNNGGGCNGC 540

TTCTAGGNGT CANGNTGNGG CCGCTNCGTT CGGCCCTGGA TGNAGCCCNG NGCCNGTGCC 600 

NCCNCCGGGG GGAGTTTGTT TCCNTCTACC GTNCCCTGCT GNGGAGCGAC GANCTGCANT 660

CCCCNGGAGC GTCTANNAGG CCGTGGCNAA CCCCATCNAN GCNCNCCAGT NAGCTTCCTT 720

CNTCCCGACA TAGTAGGCGT CNGGNGGCGT TGNCGACAGN GGCCNNCGTC GATGGGANNN 780

TCTATTTNNG NTTCATGGGC CGTATGTTAG ACCTNTCGAA GGACGCGNNA AATAGATAGG 840

GGGGG 845

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 528 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGATTTNNTA ACCTTTCNGG GAAGGGNGNG GAAAAGGNGC CAAACAAAAA GACCCCNNTG 60 

CCCGGAAATN CTTGGGGGNN ATTGNGGAGC GTTTTTTANN GGGGATTGGG GGGNTNGGGN 120 

TGCNCCCNNA TATTCCCGGC TNAGGGGCAA CCCGAGGGGT NNTNTCCGAC CATGTAACTT 180 

GTTTCGGAAT GAGGGGGAAT GCNNATTNTG ANTATTGAAN NGNGACCCGG NGGGGNCNTG 240

TTNNAATTAA CCTNNTACCC GGAATTTCNG CGAGANCGNG ANGATNNCTG GCACTTNTTC 300

CGTATTACGN GTGGCGTTCN NGANTGCAGG GGNTGCCCTT GTTTGNNTTT CTGAGGGTTT 360 

CTTATANGCA GATTGTGGGG TTGGAAACGA GANATCCCTN ANGTAATGCC ANNTCACACG 420 

GGATGGAGCA GGAACNCCCT ACGNATAGTT NACCTTCANT CAGGGTGGGG AANCGATNGA 480

CCNGAGGTAT ATGGGCNGAA CNGGACATGT NGGGNNANCC GTTCAATC 528 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 927 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AANACGGTTT AATAAGGGGG ATGTTCAAAA CNCCACTCCG GGGGAANAAA ANAAAAAATT 60 

AGGGGGGGAG AANGGATTGG NGTATAGTTT CCCACCACAA ACCTNGTTCC ATTTTTTCGG 120 

GGGGGNAACG GAGGNCATGA TTATGGGGTG AAGGCAGCAC CCACCCATTT TTCGGGGGNA 180

AGTCAGTTTT TTTTGGTANA ATCAAAGTTC CTTCGAACAT NTCGTTTTAT CCAAGGAGTT 240

TTGGTGTTAA ATTAGCANTT TNTGNGAGTT TCAAAGTTNT GGTTCCNGAG NAGNTTTGTA 300 

ATTGGTTCAC CGGTTNTTTT GNGCCAGGAA AGCAGACCCN TGTTNGGAGG GGAGATTCCN 360 
ATTTTTAGTT CCCATTTGGT GTTTCCNTAG GTAATGGAGT CTGCAGACAG TTTGAGTNTA 420

NTGAGTTGAG TCCCTTNTCC TATCAGCCGG GGTGGCATTC TGTCCAAAGG AGGAATCCAG 480

CAGCCAGATT AGATTTCAGT NTCNTTTNTA ACAGGGAAGT TAGACACACC CGGCCAGNTT 540

GCAGCCTTTC CACCCCCAAN GAGTGAACCC TGCCNTTTCA GCTTTTACCC AATTTACTTT 600 

CGTTGGCTTA GCATGCAGAT TNTTTGGCTC CATGCCCGGA GCAGCTGACA TGGGAGGCTT 660 

TGAAACTTCC ATTATCATAG AATGGCAGGC AGGTCCTTTG CGGTTAAAAC CAGGAGCCTG 720

GGCCNAATGA GATGGNTCAN TGAGCAAAGG CGNTTACTGC CAACCCTGAT GCCTTCAGTT 780

TAGTNTTGGA ATTCACAGGG TAGAAGTTGA ANACNTTTGA CTCTTCAAAA GTTGTCCCTG 840

TAGCAGGGCA GNNGTGGTGC ATNCCTTTAA TTTGGGCTAC TTTGTGAAAG ATATCCACAA 900 

NGAACCTTGG CAAGTAGAGG ANGTCGT 927 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 911 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GGGAGTTTGC TCTCAGAGNG CCNATTACGC NACAGGGGGN GTCTCACANT ATAANCTCAT 60 

ATANNATACT CTACNNTNCC CCCCCTNANG TNTCAAGGGC AAGAGAATAT NNTCTCTCTC 120

NTATCGTCTN GGGGNNTCTN AAATGTTTGN GCTCCCCGGG NAAAATANNT CTCTNTCNCG 180 

NCTCTATNTT CTCNCCTCAC ATATNTGCGN ACTCTTTCTC NNCCACANNA AAAGCGCCCA 240 

GTGNGGGGAN CTCNNAGAGT GTATNGNGAA GAACTGNNAG TGTNTNTGGG GCGCGTTCTC 300 

GGGGAGANNA TACNCTTCTC TCNTCTCTCT NTAGAGTGNG ATGTANAAAA CCNCANNTGT 360 

TGCANAGANA AATGGGGCTC NGAGNCTCTT ATATTTCCCC NCCCCTCTCN CCATATATNA 420

CCTNCGGGGG CTTNTNTNTA AATCNCCTNT CNCCATTNTT NNNANNNGCG TGTTTNTATT 480

GTNNGTNTCC NCNTGNTCCA AAAATCTCAA ATTTGTGTCT CTTNTCCCAA ACNCTATNTC 540

TCCCNTANCC CTGGGGGNGT NTATTATNTN TNTNTATATN CNTATNTTAT ATACNTATAN 600 

TNTATNTNNT ATATATTTGG GGTCNTTACC AAAACCCCNT TTTTNTCTCA CTTTTCNTCN 660 

ACTCCCTTCC CGGGGCCTNG AAANTTTATT NCCNNCCNTT NNGNTCCTTT TCTNTTAAAT 720 

TCNTTNCNTN NGGAAAACCC TTTTCNAAAC NGGNTTTCCC CTTTTNNCNT CCCNCTCAAA 780

CCCCCCAAAT TNGGGCATTT TTTCTTTTCC CCTCACCNAA CCCCNTTTNC CTCCCCCCNC 840

CCCCCCCAAA NTGNGAATAC CCTGNTTTTC AGNGGNNNNG AAAAATCCCT CCCCGANGGN 900

GCCCCCCTCC T 911

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 880 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GGGCACCAAC GGNGGAAGAG TTTTCCANGG TANAAGAAAG NAGGANTGGG NCGANAANAA 60

TTANTTTTNA AAAAGGNCAC CAGATANAAA AAACTTTTNA GGGGNGTTAA NAAAAANGCN 120

GAAACCCTCN GACGGTTTTC NNGANTNTTA AANAGATTCA GGGGAAGCAC GAGATTATCT 180

TTTCNTTTTT GAGCAAATTG CCAGCAGGGA ACNGACNAGA GGNTNGGTTT TTGNATNCNN 240

TTAAACGTAA CGCAGNTTTG GANAAACACA GNTNACATGG AAAGACCTGG GNNATTAGGG 300

TAANGNAAGN GGTTCAAGAG AGAGCCGATG AAATNGCCNG GTCCAAAATC TTTTTCCTTG 360

NCTTTAANAC AGGTNNNAAA AATNNGGCTG CTGTTTATAA CNATAGNTAA GTGAANNACA 420

ANGGGTAAGT GNTTGGCACA GNCCAGGGTA AGTAGGCATN NAAGGAATGT TAAACATNAC 480

CNTTGATCGN GNGGTTGTTT ACACCGCNTT AAAGAAANGT TTAAAAATAT CCCTGGGCTG 540

TTTCTTCCTN GGTGCCNCAN GGNGAACGAC AAGCCAAGCG NATGANTCAC AGGAGACGAC 600

ATGGGCAGGT TGGGTACAGA ATCAGTGTTC AGAGACTCCA GGGGCACCCA GATTCCNTCA 660

GNCTGTCACA CAGACACTGC TCCCAGGGAC AACCCTCCGG GATGTGAGGN NANGACTTCC 720

GNGNNGGAGA CGCTNCAGNG ANGGGACACT CCTGGTGGTA GCACACATTC TTCAGTCNGA 780

TTNTGAGCNT CTGGTCCCNG CAGAGNACAG TGGNAATGAC TTTTTTCTTA CTTGNGNCTC 840

CAAGGGCGTC TCCACAAGAC AGCGTGNCNA GTAGATAAGT 880

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 923 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GGGAGGAGTA CNGGANGGGT CCGACGTAAN TNTNTCACAG GNAAGNCGAN ANGAGGAGGG 60

GTNGCGTAGG NNACAAAGAG ATAGGAACGG GGNCGNNAAC NTNNCNTNTN GAAAAGGCCG 120 

CCANNGTNAA NCAACTNTGG CGGGGGTGGG ACNNAAGGCG NGNGGCNNNA GAAGGTTTNN 180 

TTNNTTGNAA CCNAGATTCG AGGGACGGAC NGGANTATCN TATCCNTNTT NGTTNCGANT 240 

GCCNGCGNGN ATCNGGCNAG GGAGGGTNGG TTNNNNGGTT TCNGGNGACN NCCCCAGTTT 300

NTGGNNNATA CCCNGCTCTC ACANGNNGGA CGNGGGTNTT TNNGGTGAGG AAGNNGCNTC 360 

CCCGCGAGAG CCCGNGGNAA GGGCGNGTCC AAAANTCTTN TTCCCTGCTT NTNCNACAGG 420 

CTNNGANANN ATNNGGCTGN TGTTNATCNC NATAGGTAGN TCAACCNNCA NGGGGANGTG 480 

CTNNCACACC CCAGGTTAGT GTCCCNTNCA NGGTATGTTA ANACGTTACC NNTGATCGGG 540 

GGTTNTTTAC NNAAAANNAA AAAAAAANTC ACCNTCCCGG GCNTGNTGNT TCCTNGGGGC 600 

CCCANGGTGA ACGACNANCC AANCTNTTGA NTNACAAGGG ACGACGTGNG CAGGTTGNCG 660 

TNCNGAGTCA GTGTTCAGAG ANTTCNGGGG CACCCCTGAT TCCCNCGGNN GTNACACAGA 720 

NACTGNTCCA GGNNCNNCCC TCCGGTTGNG AGTCNAAGAC TTCNGGNNGG TGACNCTACN 780

GTGANNGGAC ACTTCGTGGN GGTGNCNCAC ATTCGTCGGT CGGCTTANGA NCNTCTNGGT 840 

CCCNGCAGAG CACTNTNGCA ATGNCTTTNT TTGTTCTGGG GCTTCCNAAT GGGTCCTCCC 900 

AAAAGNCNGC TTTAGCTGTA ATA 923 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 880 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ANANAGAGTA ANTAANANAA GAGGAAGAGA NAAGAAAGNA GAAGGNAAGG ANANAAANGG 60 

GNNGGCGAGG AAAAAAGGAA AGGAGAANAA TAAAAGAAAA AGTGAGGAAG GAAGGAGTAN 120 

NAGAAAAAAG NAAAGNGGAG ATAGNAGAAA GGNCCGGNGG ANAAAAGANT AGATTAANGA 180 

NAGNTGAAAG AATAAAGANN ANGGCGANAA GGAAAGAAGA NCGAGNATTA GAA7VNAAGAG 240

AGGAAAGANN NGGGGGGAGG GAANGAGGCG AANTCNNGAG ANCAGTNNAN AAGGCAAGAG 300 

AATNAGGAGN AGANANGAAG NNNANGANGA AGGAGGGGAA AGAGGGNACA GAAAAAACAA 360 

GTANAGTAAC CNACNNCNGC GAGNGNGCCA AATAGGTNGC GCCAGCNACA NGGCCCGAGC 420 

CCNGGGCGAG GGGGCATCAN GAGCCAAGGG GAGCGGGTCC AGNCNTAGTT NTGAAAGGAA 480

AGGGGAGGNG GGNAGATATT ATATGGTCGN GCCCCCCCCN GTGTCTCGGT GAAAAAAAAA 540

AGGNGTGANN AGCAGGGCCN TNTTGGNTGN GGGATCGNGC ATGATCAGAG ACCNGAGGCC 600

GGACNTTCCG CNGNGCCTTC CGTAGGCCCA NTGTCAAATG TATTCAAGCC GGTTNGAAGG 660 

ATGCCGGNGN TAGNGANTGA TGCGGGGGCC NGCCCCCCCG GNTTTCCGCC CCCGCAGCCN 720 

CNGTGGCCGC CATNACGGAG TTCCCAGTGG TGAGNGTGCG GAGNTGAGGC CCCGCGGGTC 780 

GCCGCCGGTC CCCGCAGACA GGAACGCGGA GCGNNCCCTG CGCTNGAACG TANGGGNCCA 840 

CTTGAAAGAC TNNACNAAAN GACGCNGATT TGTAGAAAAG 880 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 166 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
ATTCTTCAGC TTTTGCNTAG AGGAAAAAGA ATGGATTGTT TCTAGGACAA CCTGCTGAGG 60
TGCTCACCNA GNGTTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC 120 
TNTGNCTCTC TCCTGAANNT CCCCANAGGN NCTTNGCAGN AAAANG 166 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 162 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CNTTTTNCTG CNAAGNNCCT NTGGGGANNT TCAGGAGAGA GNCANAGAGA GAGAGAGAGA 60
GAGAGAGAGA GAGAGAGAGA GAGAGAGAGA GAACNCTNGG TGAGCACCTC AGCAGGTTGT 120
CCTAGAAACA ATCCATTCTT TTTCCTCTAN GCAAAAGCTG AA 162
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 871 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic)



PCT/US97/06067 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GAATAAAACC CCAGAAAGGT TTTAAAACAT TCCGTATAGA AGTTGATNAA TTNAAATAAT 60

TGGAGGTGAA ATACACAGAG GGTTTTTCAA TTAATCAATA AAAAAATAAA TTACNTACNT 120 

NTTTTGGGGG GTTTTATGNA NAAANGAATT GGAGGGATCA ATTTGCAAGA AATTTATTTT 180

TTNGTATTAT TTAAAAACCG TTANGGATTC NGTTGATTTT AAATCAAGCA GTAAATATAT 240 

TAAAAGGTAG GAGAATGGTA TCAATAGGCC AAGATAACAG AGTGTAAAAG TTAAAAGTAT 300 

TGGACAGAAA TATTAAGAGT TATTGTTAAG ATCCNGGACT TTGGAAAATT TAAAACCAAG 360

CGATTTAGGC CAAGTTATTT CCACAGTATG GTATCAGAAG GAGTAAAGAG ACAGCACAGG 420

TGCAGATNTG ACGGCTTGGT TCCTTAGGTT ATTGCCACAG CAACGGTCTT GGCCGCAAGG 480

CAGGCTTGGG CCCAGCATGA GAAGAGAGGG GGAACCAAGT TCTTCAGGGA CCNGACGGGC 540

GGCGCCGGTG AGAAAGGACT TCATCTTGCC ATGNTCANTC AGCGAAACTG CAAACGCTTN 600 

TGGCAGAGAC AACGCCAGAT CTGCAGAGGC ATTCCGGCCT TTAACCGCTT TCCCACAGTC 660

GGCCCACAGG CCTTACCGCA GCAGAAAGCG CGCGACCCGG AGGTCCCGCC AGTCAAAAGA 720 

AAAAGGGGGG CGCAAAACCA TATAAGGCNT GGAGCAGGCG GCCCGGCCCC GCCCCCAGGA 780 

CATGGGCCCG GCCCCAATCA TGCCCCGCCC CCAGGATTCG GTCCCGCCTC CTCCCGCTCC 840

CGGGATGGGC CGTTATGCTC CCGATACGCA T 871 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 936 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

TGGGATTCAA AAATTGGAAG TTANTTTTTN AGGAAATTTN TTTTTAAAAT TNTAATTGGG 60 

GGGNNTNGCC ACCAATTAAA ANGNGTTTGA ATTNAAAANG ATTGCCGGGG GAAAAANCCA 120

TTTNCTGCAN GGAATTAACC AAGTAATTTG GNTTGGNAGC ACTNGTTTTG GGCCTNTAAA 180 

AGGCATTTTA AANACAAATT AACAGGGCNG GCATNTTCAA CGGGNGNTAG NTTGTTTTNA 240 

TGAAACNGAG GNTTTTGGGG GCGGGCCTTT CCNATTNGTT TCCTTTTTTA GGATTAACAG 300

ATGNGAAAAA AAATNATGGT TTTATATCAT CGTTNTTGGC ATCAGCAGAT TGGCNATTCA 360

ATTAAAACAG ATCATTCATG ATNGGCTTTT TGGCCATTAC CATGNAAACA CAAAGAGCCA 420

GGGTTTGATT GCCCTGACCC GCCNACCTTC GGTTGCTTAG GTGAGGTGCA GCACTGCGTT 480

TTTCCTTTTC GGACTGAAAA CAGGCGAATG AATCATTTCN GTCGTGTCTT GAGGGTGCAT 540

TTTTNACATT TTTGTGCCNT GCTGTGCGCC GGTGTGTGAT TTCCCTGTTT TAAGTGGCCC 600 

CTGAGGATAA CAGTGAAGTG CTGTCTAGCA TTCTTCTGCG CAGGAAGGCG GAGATCTGCC 660 

CTGCGGAGAA AGTATGCGTG CTGGATAAGC ATTACTGAGC ATGACACAGA GCACCGTTGA 720 

CCCCGAGTGC AGCGTTAGTG AACCGGCCAA TGTGCTGGGG GATTTTAAAT GGAATCACAC 780

AGAAGCTGAG GCTGAGGATT GATCTGTGAG TAACAAGTTG TGAATGAGGC TGGCAGGAGC 840

TAGCCTGGGA GTAAGATTCA GTGTTTGNTA ACAGCGTGCA GGCATTAAGC CAGGGAACTG 900 

AAAGTNCCCA CANNGNCTTT GGCAAGTAAG AAGTCG 936 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 888 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



AGGNNGGGGG GGGAAACTTN TTTATNTGGA AAANTTTTGT TTNGGCGGGN AAGGAGTTTT 60

TAANAANGTT AANGGAAAAA GCTTTTANTT AANATGACCT TTTTGGGGGA AANACAAANT 120

TGGTNNGTGT ATTNGNGAAA AAGATTTATT ATAAGATTTT TTATAANATT TTNGGGGGGG 180

AAATATTTCA AANAAAATTC TGTAACAAAA GGNTTTTTGT TTTTTGTTNT CCAAGNAGTT 240

NTCCAGGTAG TTNTCAACAA CNNANGCCNT AGGGAAGGAC ATCATATGGA TATTTTCANA 300

GATTTGTTTT TAGGAAACAT TNTAAAGTCA AGGTTAAGAT GACAGTCAAN TCCCANGAGN 360

GNGGTAACTG TNTGCTTCTT TATTTAAAAT TCAATATTCA GGATTTCATT TATACTAACA 420

AGANTAATTA CCATCTTAAT GAAACATAAT TTGAATAATT TGCAAACAAT NTGATTTTTC 480

TTGAATATAC ATGTTACTAA AATATTANGG ATGCAAATAG NTAATAAACA AATAGATANG 540

NAACCATGGN ACACCCCTTC TGTGATTGGN GGGACNTGGG CATAAGGCTT GTTTGTATAA 600

TAATGTTCAT ATTTTACATT CTTCCTNNGA GGANGGTCCT CCCTGTTAAG AAAANGACTC 660

CAGGATAAGG AGACAGCACC AGTNTAGGAA GTGAGGNTCT GTTTAATGTC TTAGCAAAGT 720

AGTAAATGNT GGGACCATCA GAATAGCCCN TAAGGNTGTG GANAGAACTC TAAAAGCNTG 780

ATATATATAT ATATATATAT ATATATATAT ATATATATAT ATATATNTAT ATAAAGAGGC 840

AGTATTGAAA GACNTNCACC AATNGAGCTG GCNAGCTAGA AGAGGTCG 888

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 903 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CTTGGAAGGT TTTTTTNNCA AAANCCNGGG NGGGTTTTTT TTAANAAANA GGNGAAAAGA 60 

TTTGGAAACT TTTTTTTTTG GTTGAAGTTA NTTGGGGATT GGGGGAAAAA TTAAAAGGAT 120

TCAAAGTTCC CATGGNTTGG AAGTANAACT TTTATTCAGA AGNGAAAGTT TTAATAATGA 180

AANATGTTTT TTTGGATTNA CGGNGGNGGA ATTGGGGAGN GGAGAGAGAA GAGAGAGAGA 240

GAGGGAGAGA GAGCCGGATC CGCANTCGGG GGTTTCTACC GGCAGAGCCA GGACGGAGAG 300 

GGTTTTCGGC AGCCGCNGCG GGTTCGGAGN TTTTAAGGTT TNTTAATCTT GGAAGGTGTC 360 

TGANATNACC CCGTTTCTTG TCGGTGATGT TTNGTACAAG CTTTCATTTC TTCAGGATTT 420 

CGGAGCGCCA ATTACTGCCC CGATNTGGTG TTTATGTTTG CCCGTTCNTG CGCNTGGCCC 480 

CGCGCCCGCC CGNGAGCTGC GTTTTCCCTG GCCGCGCGGC CCGAGGGGGT GGGTGGGGGG 540 

CCTTGGCCCG CGCACCCCAG CGCAAGGGAG GGGTCCCCTT CATTTTTTTT CATTGACTTC 600 

AGCACCATGT GATCAGGAAG TCTGGCTCCN TCCATTTCCC NTCCCGACTG AAGGGAAACA 660 

TTGTGTAGCA GCCCGCCGCG GCCACTGGTG GGATGGCNTT CGCTGGCCTG ANGTAGGGGG 720

ATAAAAATAA CCGGCATATT TAAGGCCGGA GCAGGAATCC CGGCGCTCAC ACGCGGCCTG 780

GTCAGTTCCC GAAGCCGCCA GCAGCGCTCT GCGCAGCGAG CTGCTGCTGC GCCAGCCAGN 840 

TCGGGAGTGC GGACACCGTG AAAGACCTTC ACCTATAGNG CNTGGCAAGC TAGAAGAGGT 900 

CGT 903 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
TCGGGGGCAG GAAAANTTTG GCjvj 1 III ouw 7XAAAAAAAAA ANGGGCANAA ACCCGGTNAA 60

CNTATTNGTT TTNGGCCCNG k * « PtT* 7\ TV TV KT TV UTTTTTTTTT NAAAANATGG AAAAATTGAA 120

AAGGGANANG CAGGGAAGGG NGGNATTTTA i IN 1 W^->Vrvl' X x TCNGGTTCCT ACTTTTTTCC 180

NGATTCTGTC AGTTTCGCTT TAAGCAAAGG NMAGTTTCAG AAGTTAGGCT 240

TGCCTGAGAA AATTTCAATG GGTGGCAATT L 1 I AvjvjMv- X ^- a GGACAGGAT TCAGNGNGGA 300

CTAATNTGCA TTTNGGGATN TGTCCC I vjoIj r.TPrNTJVAGN U X WIN A rt"\Jli TCCGGACCGG GANAGATGTT 360

CNAGGGGGAG ACCCAANTAA T CZ AAAT TAT C ATGGCAGCNA CNNACCAGTA 420

GTTGNTCTGG TAATAGAGCA AAACACGGTT GTTCCATTTG GATATATCCN 480

TGAAGTCCGG CCGTGCGAAA CCGGGAAGAA ATCATCCCAG GCACGGAGCG 540

GGGCAAt^ox I *r rx ix fiT C C AT GTTCTTTTGC TTGGCGAGCT TCGCCTTCGG AATCCGGAGG 600

CGGCGGCGGT AGCAACCAGC TGAATGAAAG ATGACAGCGG CTCNTTCGbA 1 X V7UW 1^1 660

GGTTAGAGCA CCGCAGGGCC CAGAAAATTG GCCGCGGGCG GGTGTGTTGG TCTTTCTGTG 720

ATTGGCTGGA AGTGGTTAGT GACGGAAAAC TGTGGGCTTT ACCAAATGTA AAACGGAGTA 780

CTAACAAAAA GTAACCAGCG GAAATGCCCC CCTAAACTAA AGGTGGTGTC AGTAGTCTCT 840

CTGGCAGTTT AAATACAAAC NATCTCTTTT TAGGCATTGT TTTGAAAGTC CCCACAAGGN 900

TTTGCAAGTA ANAAGTCG 918



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AGAGAGGGTT TAGCACAGGC AGCNTATTCC CAGTTTGTGC TGTAGAACTG GAACCTCAGG 60 

CCTCATTCTG AAATNTGCAG CCNTCCCCAG CATCCTTCNT GGCACAGCNT GGCACAGACN 120

TGNTAAGTGT CTATTAGTGA CTAATACAAA GGAGTATTTC AGAACGTTGG CACATCTCAG 180

CACGTTGCAA CTGGCTGGAG CTGGTTGAGC TCTTGCTGCT TCCATATCCC TTTGTAGCTG 240

CTCTCCACTT TTCTGAACCC CGGGTCCATG TGAAAGTCCC CACAAGGNNC TTTGCAAGTA 300

GAGAAGNCG 309

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 904 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

TTTCATTTAA AACNCGGGGG NTGAACCCAA TCTTNANGGT GGCAGTGNGG NNGATCTTAA 60

CGGTTTTTNA GAAAAAAAAN TNCTTCGCTC NCACCCCCAA GCCTCCCNTT CTTANCAGCT 120 

TTTTTATANG AAAAAAGATG ATAACGAAAT TTTAAAAACC GTCGTTAGAG GAAATGAAGG 180 

TTCAGCCGAC CATTACCTGA NAGTAATGAA GGTNTTCCGG AGGGTTGCCT TCCAATCCCA 240 

GATGGATTTG AGTTTCAGGA TCAATTCAGT TACCGNTGAC CATCCACCNN CCTCCNGTAT 300

AATCATTNGA TGAGGATGAA TGGTGAGTGA GTGATGATGA TGATGATGAT GATGAAGGGA 360

TGAGAAGNAC ACTATGATAA CAAGTGTCTC AGTCCACATT AAGGTTTGCC TGNAAATTAG 420

TGCATAAGCC ATGGGAGACA AATTCTTTTC NNACACAATT AATAGTNTCT TANTCCTTCC 480 

CATCTTCTCT GCCCCATTCT GTTTTCCACC ACAGGTCTGC AGCGGGCTAC AGCTTCCAGT 540 

CTCCAAGCAA ATACCAGAAC TGGAGGAGAA AATTCCAGTC CAGTGAGTCA TGGGCAGGGG 600

GAGGGGTGGG GTAAGGGCAG TGGCGCTCAT TCCTNACATG GTGTCTTCTC TTGCCTAGCC 660 

TGGGATCTGA GGGCAAGAGA ACCTGTAAGC TTGATTTGAT TTCCACTGCT GACTGGAGTC 720 

ACTGCCAAGG GATTTGGGAC TTCTCCATCT CTCTCTCTAA CCTGAAATCC TTAGGATTCT 780

ATTATTTCAC CGGACCAGAG CTGTAGCAGA GATGAGCTCC AAGTTTGAAA TGAGAAAGGG 840

GAAATTGAGA GCTATGAGCT AGGNGCGAAA GNCCCCACAA AGNNTTTGGC AAGTAGAAAA 900 

GNCG 904 
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 883 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

GGGGGGGGAA ACTTNTTTAT NTGGAAAANT TTTGTTTNGG CGGGNAAGGA GTTTTTAANA 60 

ANGTTAANGG AAAAAGCTTT TANTTAANAT GACCTTTTTG GGGGAAANAC AAANTTGGTN 120

NGTGTATTNG NGAAAAAGAT TTATTATAAG ATTTTTTATA ANATTTTNGG GGGGGAAATA 180

TTTCAAANAA AATTCTGTAA CAAAAGGNTT TTTGTTTTTT GTTNTCCAAG NAGTTNTCCA 240

GGTAGTTNTC AACAACNNAN GCCNTAGGGA AGGACATCAT ATGGATATTT TCANAGATTT 300 

GTTTTTAGGA AACATTNTAA AGTCAAGGTT AAGATGACAG TCAANTCCCA NGAGNGNGGT 360

AACTGTNTGC TTCTTTATTT AAAATTCAAT ATTCAGGATT TCATTTATAC TAACAAGANT 420 

AATTACCATC TTAATGAAAC ATAATTTGAA TAATTTGCAA ACAATNTGAT TTTTCTTGAA 480

TATACATGTT ACTAAAATAT TANGGATGCA AATAGNTAAT AAACAAATAG ATANGNAACC 540 

ATGGNACACC CCTTCTGTGA TTGGNGGGAC NTGGGCATAA GGCTTGTTTG TATAATAATG 600

TTCATATTTT ACATTCTTCC TNNGAGGANG GTCCTCCCTG TTAAGAAAAN GACTCCAGGA 660 

TAAGGAGACA GCACCAGTNT AGGAAGTGAG GNTCTGTTTA ATGTCTTAGC AAAGTAGTAA 720

ATGNTGGGAC CATCAGAATA GCCCNTAAGG NTGTGGANAG AACTCTAAAA GCNTGATATA 780

TATATATATA TATATATATA TATATATATA TATATATATA TNTATATAAA GAGGCAGTAT 840

TGAAAGACNT NCACCAATNG AGCTGGCNAG CTAGAAGAGG TCG 883

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 924 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



TTTGGAAGGN TTTTNAGGAA AGAAANTGTN TTTNAGGGNA GGGAACCCTA TTCCGACGGG 60

TTGGGGGAAA ATTTTGGGTT GACCCTTCGT TAAAAAGGGT TNCGGTAAAA GGGGGCNANG 120

TNTTNNAANA AAAATAATAG TAATAGTAGT AGTAATAGTA TTAATAATAA TAATAATTGC 180

AGGAATCCTG TNACCNTCAG GAATTGGGGA AGTAGTTTCT TATTTTAGGA CCAGGTGTTT 240

TGTTTCAGGG GAGTTATTTT TTGTTTTGTG GATGGGATGA GTGGTNTCAA TTGCTTTNAA 300

AAACCTGTAT TAGTTTTGGC ACAGTTAGTG TGTNTCNGNT TCGTTNGAGG AGTTTGAACT 360

GGATGGTAGG CAATGGNTGC ACAGATTCAT AGTGGCCAGA GTTAGAGTAA ATGCTTGCGG 420

AGCAGTCAGA ATAGATGAGA NTCAGGGACC CGGCAGATGA TGCAGGGAGA ATGTAAGAGC 480

AGAAGGTGGT GGGTAGCATG TGGAATGCAC ATTTCCAGGC GTGACATGAN TCGGAACAGC 540

TGTGACTGCT TAGACCAAAG TGATCCCATC AACACGGCCA TTCAGTAAGG AAGGGTCATG 600

GGNTCCCCCC NTCCCTTAGG ATTNACATAC AGATAATGAT TGATTGGTGG ACCAGGGGAA 660

TGGGGAAAAA TGTCNTTTTC GTTGGTATAG TCACTGGTAG CTGCCCATGT TTNTATAAAC 720

AAATTNTAAA GAAANTCATT GGTTCATACA CGTAAGAAGA CATCAAAACA GAACTGAGGC 780

AAGTTGGGAA GAGAAATGGG ATTAGTAGGA GAGGGTCAAG AAAAGGCAAA GGTATGTGCA 840

CATGCATGAA TACATTGTAT ACATGTATGA AAGNGCCACA ATGATGANTT ACCCCANATG 900

GNNGTTTGGC AAGTAAAAGA GTCG 924 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 482 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TCTCTCCTGA GGGGGGTTTT NTGGANGAAT AGAAGAANAN ACCNCCTCTT TGTTTCNTCC 60 

TGTGGNGNNC CCTGCTGNTA AAGNNGATTT NCNCGGTGNT ATACANNTAA GAAGGAGGAT 120 

CTCTCCCCCC ATTGTNANAG AACCCCGTGT GTGGGGAGGG GGTGTNGCCA CNANCCAGAN 180 

NTGGCCCNNG GGTCNTCTCC CCACTCNTNT GNATAACNTC TNNCCTCCAC AAANACCCCA 240

NANAAAANCA CCCCNCNTGT GAGNNCNGCA GANGCGCCCT NTNACAAGAN AAGAGNNCAT 300 

GTGNTGTGGC CCTGTGCTNN GACANTNTAN ACTCTTCTNT NGNGGGGNGN GGNCTGTGGT 360 

TTTATAAGAG NGTGTNNCCG TGGGGGGGAG AGTANTCNTT TTATATAGAG AGANAGNGNC 420 

CTGTGNAAAC TNCCTCTGAG AAGAGCACCN TGGTGTTCTC TCCCATCTNC TAGNAGGGGA 480 

GG 482 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

TAGCTTCTCT GTGAGGGGTA GAACTCAAGC TCCCCCATGA ACAGGCTTTG GGGTTCCTGC 60

CATCCCCTGG GGCTGTTCAT TAGGTGCCCA CACAGACTTC TCATGCCATG ACTCACACTT 120

GACGTCACAG AGCACACAAA GAGCACAAAA GCAGGCTGAC CACATCCGGC CATGCACACC 180 

CCTTTAACAG TCCCAAGCTT TCTCTCTCTC TTCTAAGTCA CTGCCCTGGG AAGACGGTTT 240

CATACCCAAG CTGATGTGCA CTTATTTCTT TGTGTTATTG CTCTGACAGT CTCACAGTGC 300

TCTGCAAACA CTCTGCATTC GCCTTTACCA CACCAGAAGA AATTCCTCTT TGTGCAGGGA 360 

AAAATACATT CGTCTTAGTA GCTTCTACTT TCCAGCTTGT CCCTAGTCTG TCTGATATGT 420 
GGTTACGTAN TGTTAGGGGC CACGGAAGGG GGGGGGGGGG 460

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 465 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TCCCAAGACA AGAGGGGCTG AAGAACGGGG GGGGGAAGAA TCAGGAGTGT GTCGCTGCTT 60 

CCCACATAAA GACGGCACCT ANATCTGTCT CTCTCGGTGT CTCCTCCCCA CCTGGGGCAG 120

GGTGAGCTCT CTAGACAAGA GAGAGACTGT CACAGAGAGA GAGAGATGTG TCACCCCTGT 180 

GGAGATCAGA GNCNCCGACA CCTAGGGGAC AAATGGGGAT CTCTTTTTTT TTTCTCTCTC 240 

GAGACAGGGG GTCTCTGTGC AACACTTGCT GTTCTGGAGA TGTTCTGTAG ACCAGGGTGT 300

CCCCCAACTC AGAGAGCCTC CTCCTTTNCA CAACTGTGTC GCCGCCGCCG CCGCCGCCGC 360 

CATCACCAGG CTATATTTAC TATTATCTCT ATTACTATTG TTGTGTGTTG TGTTGAGACA 420 

GGATGCTCAC GCATAACCCT ANCTATCCTA GTGATAGACC CCACC 465 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 568 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TNNCNNTTNC CTGNGGCCGN GTANCTCTGA GNGANAGTNT CCCCGAGAGG GGGGGTCTCA 60 

CNNTAGNTNT ANANAGTATN GNGTGCTCGA GTTTNNAGAG AGCTCTCTCT NNNTCTCTCT 120

CCCCNGAGCT ATNGNNTTAG GGNTATGGCA CNNCNCGTCT CTCNNCNCCN TATNGAGNGG 180 

TGNGNTATNG GGGNGAGAGT NTCTGCCCGA GACCCACATT CTCNGAGTNN GGNAGAGTNT 240

GGGAGACACA CANCTCCGGG NANATCTNTC TCCNCCCCCC CAGGGGCGGT GGTNCANATN 300

GNCNACAGAG CCNCNGNNTT NTATGTGGAG AGGGGATATC NCANCNCACN CCCNGAGCAC 360

AGGNTCCACA CNCAGAGANG TGTCTCTCCC CANCACACAA GCACNTCTGG TGAGNTCTAN 420

GTTTTGNGAG AGACNNTGCC CTGTCTCCCT TTTCCCCGCT CTNACACACA TGAGAGGGTG 480

TGCACATCTT CCCCATGTCC CTCTCTAAAA CCNCCCCAGA NTTTTGNGGT TNTGTGCAAN 540 

ACCCTTTTCA CNCTCANGGG AGATNTTT 568
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 920 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GAGGGTTANT TGGCCCAANT CGGCAATCAT CCNGGGAAGA AGANGNCAGG GTTTNGGCAA 60

ATCGGAAGAT CAAGGACGCA ATTCGNGGGG GGGGATGGAT AGNNGCNAAA GGGNACNGAA 120

AGNNGGATTG GNAGGNAAAA TTAAACGGGA GTTGTAATCC AAAAGGACGA CAAGGCAAAA 180 

ACAAATCCGG NAGTAAGCAG GAAGCACAGT GAANTTGGGG GAGGCAGNGT GGNGNAANTA 240 

AAAAATNGTT TTTTTAATCC CAATANGGTC AACANGTAGG CAANTGGATN TATTAGATAT 300 

TATATCTTAG CGCAAGNTTN TCACCCATTG GTCCAACCCA TATAACATGG CGGTGGTNAA 360 

TNTNTGAGCN TGGCACAATT TTTNACCCAT TAGTTCCCAA GGCAGATCGC CACCATGCCA 420 

GAANAAAATC CCAATTCCAT GGTGGCCCAG TGTGTCCAGC CACCAATANT TTCTTGAATT 480

CAATTAAATC ACCACATGAA GGAATACATA ACACAATAAC ATCTGATCCA ATTGATAAGA 540

TATAATTTGC TCACNTAGAC ATACAAAATC CTGTACATTC CATCTCTTAA GAATATTCAT 600

AACAAACTAT AAATGTGTAG AGAGGAATTT TAATATCCAC TTCCATGTTC TCTTGGCTGC 660

TCCTCTCTCC CAGTCTCCTC CTCCTCCTTT AAAACTTTTT TCTCCCACCC ATCATTTTTT 720

TTTGTCCNAA GGACGGGCCT TGTTNTATCC TGNACCTGCN TTCGTCTGCA TAAGGCCATC 780

ATCCCACAGG CAGGACTGGA GCAATGGCTC ATTGGTTAAG AGCACTTGCT GATCTTGAAG 840 

AAGACCAGGG TGCAATTCTC AGAGCACTNC ACTGCTNCAC ACTGAAAGAC CCCACNNGTA 900 

GGTTTGGCAA GTAGAAGAGA 920
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 176 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TTGACCATAT TATTTTTATT CACGTTGGGA CAAAAGAGCA AACGCAAAGG ATAGGAAACG 60
AAAGGAATTA ATTTCCTTTC AATAGAGATA TCGGTTTTTT TTAGAGGGAA AAAATTGAGT 120
ATTAGAAAAT AAAAATAGGT TTCGGAATTT CCGGAAAGAC CACTAAATTG TAGGTT 176
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 336 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
AAAAGGGNTN CCGAANAAAA ANAATTNGGA TCTTNTGGGG GCCCNGAGGN AAAAAAAANA 60
NTAANCNGGG GGNGACCCAG NGAANAGACA AATTNTTTTN CCNGGAGTCC TTGGGGTGNN 120
ANGCCAAACN GNCGTTTANN GNAANNNGNC GNGNTACCNC TTCGGAGNGG GGGCGCTGNA 180
AAAGAATNGT GAGAATNCNG TTACNNGTGT TGNTTNATCN GAGATAGTNG TNTGTAACAA 240
CCCCGATTCA GCCNGAAAGT TACGCATATG CGNANCGTTG TGTGAATCGA ACCTGGNNAA 300
AACAGACCCA TNGNCAAGNG GCAGACCNAA CGGAAC 336
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 92 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TGAATAAGGG TACAAAGATT GTGTTTCAGA GGAGAGAGGT AACAAGAAAA GACTCCTAAC 60
GCAATGGCCA GAGGGCCAAG AAAAAGGGAA AA 92
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 838 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GGNGTNATTT TCTTCTNGTG AANTCTTTNC CAAATCCGNG GGTNTGNCCC ANNGCCCCNN 60 

TTTATACACN NNATTACNCN TNNNCCAAAA CNCTATATGT NTCGANATGT CCCATNTTAA 120 

ANATATGNGA CTCAGTTTGA GTNTCCCCAN NTTGGNGTTG GGGTATNTGG GTAAANACAN 180

NGACCCTCTN NGGNGNTTTA TTTATATATN NGNCCCNATA TAACNCAGAG ATCTGTGTAA 240 

AAAATATNNC NNTTCGCGGG GNGGGAGATT TCTCTCTGNN GTAGNGCNCT CNNCTGAGAN 300 

GCACAGNGCC CTGTGTTNTN TCCCCCTCNC CGAAAANAAT TTTNTNCAAA AANANANAAT 360 

ATNNACANAC CCCNANAAAT ATNCCCCTTN TCTACCNCCC CTCAAANACA CCNCNNTTTT 420

TTTTTNCCCC TCAGAAATNT TTNTAATNTG GGNNAAAAAA ATCTNNGNTG GNNTTNTCCC 480

CCCNTTTNNA GNCGCCCCCT NNAAACCCCC NCTNTTNANA GANAAATATG TANACTCNTA 540 

TTTAAAAAAN AACANTTTTT GTTNGGGCTN GGGTNTNCCA NCCCTTCACT CTCTTTGTGG 600 

GTNTNCCTTN CCATATNCCC CCTNTTTGAG ACNTTTAAAN AACCCTCTCC CTAATTCCTC 660 

CNCCCNCTGT TTCCCCCTTT TNNAAAAACN TCNGGCCCCT TNGCCCCCCT TTTCTNACTC 720 

CCTCTTNTCC NGAGATTTTT TCCTCNTNNT NNCTAATTCC NTTNTTCNAN TCTANATNNC 780

NNTGTTNCNA NCGCANGNTN NCCCCNCCTT NNNCTNAATT NTNGGGNAGG TTCCAACC 838 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 314 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

CAAACCAGAA ATGGCCCAAG GGTCATCTCC CCACTCAGTA TGAATAAGAT CTAACCTCCA 60
CAAAAACCCC AAAAAAAAAC ACCCCAGATG TGAGAACAGC AGAAGCGCCC TATAACAAGA 120
AAAGAGAACA TGTGATGTGG CCCTGTGCTA AGACAATATA AACTCTTCTA TAGAGGGGAG 180
AGGACTGTGG TTTTATAAGA GAGTGTAACC GTGGGGGGGA GAGTAATCAT TTTTATATAG 240
AGAGAAAGAG ACCTGTGAAA ACTACCTCTG AGAAGAGCAC CATGGTGTTC TCTCCCATCT 300
ACTAGAAGGG GAGG 314

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 226 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

AGGGGGGGAA ACCCCTTCGC CNCGGGCCTA TCGNAANTTT TNNTCCACCG TAAAANATTT 60 

NCCANGNGCN CCATGTANGG ATTGNGGGNG TAGTGGGGGG AACGATTNTG GAGGGGCCTA 120

AAAGGNANAT AGAGGACGTA TTGTATTTGG TTTTGCNGAG CCAGTACCTT NGAAAAAGGT 180 

TGGTATTTTT GATCCGGCAA CAACCACNGT GGTAGNGTGT TTTTTT 226 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 843 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GAATTAAAAC GGGAAAGATT GGAATTCAAT TTCTTACAGC CAAAAGCTAG ACCGGGCATA 60
TAGGAGATTA TTTCGATTTA GCACCTTCCA AAGCCTGCCC CAGATTTAAA GTTTAGGGGT 120
ATTATTTAAA AGCAGGTTCC GGGAAGTTCC AAGATAGGCC TAGAGGTAAT GGTATGCAAG 180
CAGTCCTAGG TTTCAGAAGA GTTCAAACAC GGGTCTTCAG GAAAAGACGG AAAGTGTAGA 240
TTGATCAGGC CAGCAATCAT ACAACAGTGT TTGTTGTAGT ATTACCTTTT CTAATGGTTG 300
TCACTGAAAG GAGATTATTC TAGGTTTGGA GATACAAAAT TAAAAGAATA AACCCCAAAA 360
GGCCACAGAC CCAGGGTAAG CCCTGTAGCC AGGACTAGCA GGCCATAAAG AAAAAGGAGC 420
ACAGGAAACA CTGTCCAGGC AGGACTGGCA AGCCATAAAG ATAAGGAAAA GGAATGCAGG 480
AACCAGCCTG AGTTAATGAG AAAAATTAAT GGGACGTCTG GCAGGAAGAC ATCTCCCCCT 540
AGCACACTCC GGGCCATATC TCAACTAGGT GTCCTCCAGC CCCTGACTTA TAGCACGTAC 600
TCTATCTGCT TTGTTATCAC AGATATGTTT GAATGAGCCA ATTGTATGTA ACCACGCCAA 660
AACCCCCTAG CTTTGTCTAT ATAACCGTCT GACTTTTGAG TTTCGTGTTC AACTCCTCTG 720
TATCTTGGGT GAGACACGTG TTGGCCCGGA GCTTCGTTAT TATTAAACGA CCTCTTGCTA 780
TTACATCATG ACCAGTCTGG TCCTGTTGTA AGACATTGGC AAAAGAGCCT GAAAACTAGA 840
AAA 843



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 943 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

TTTTTTTTTT GGAAAAACGG GTTTAATAAG GGGNANGNAT CCGAACCCCC ACTCGGGNGA 60 

AAGGAAANAA AANAATANGG GGGGAANAAN GANTTGGNGG TAATGCTTTA CCACGACAAA 120 

CTAGTCCCAT TNTTCGGGGG GGGAAAGGGA NGGCATGAAT AATGGGGTGA AGGCNGGCAC 180

CCACCCCATT TTTTCGGGGG TAAGTCNGTT TTTTTTTGGT ANATCAAAGT TCCTTTCGGA 240

ANATGTCCGT TTNATCCAAG GNGTTTTGGG TGTTNNAATT AGNATTTNNG NGAGTTTCAA 300 

AAGTTTGTGT TCNNGAGNAG TTTGTAATTG GTTCAGCNGG TTTTTTTGTG NCAGGAAAGC 360 

AGACCCNTGT TTGGGAGGGA GATCCAATTT TNTAGTTCCC ATTTGGCTGT TTCCTTAGTA 420 

ATGGGTCTGC AGACAGTNTG AAGTNTATGA GTTGGTCCCT TCTCNTATCA GCCCGGGGTG 480 

GCATTNTGTC CAAAGGAGGA AATCCAGCAG CCAGACTAGA TTTCAGTNTC CTTTNTAACA 540

GGGAAGTTAG ACACACCCGG CCAGTTGCAG CCTTTCCACC CCCAANGAGT GAACCCTGCC 600 

NTTTCAGNTT TNACCCAATT TACTTTCGTT GGCTTAGCAT GCAGANTCTT TGGCTCCATG 660

CCCGGAGCAG CTGACATGGG AGGCTTTGAA ACTTCCATTA TCATAGAATG GCAGGCAGGT 720 

CNTTTGCGGT TAAAACCAGG AGCNTGGGCC AATGAGATGG NTCANTGAGC AAAGGCGCTT 780

ACTGCCAACC CTGATGCCNT CAGTTTAGTN TTGGAATTCA CAGGGTAGAA GTTGAAAACC 840 

TTTGACTCTT CAAAAGTTGT CCTGTAGCAG GGCAGTGGTG GTGCANACNT TTAATTGNNG 900 

TACTTGTGAT AGTCCCACAA GGANCTTNGC AAGTAAGAAG TCG 943 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

ACTTCTCTAC TTGCCATGGT CCTTGTGGAA TCTTTCAATC TGTGTCCTTA GAACGCTAAG 60
CTAAGACTTG ACCTTGGCTC CCAGGGCGGG CTGGGACTTG GCCACCCCGT GAAAAGGGCT 120
CTTTCTCAGG CAGGTGTTTT CGTTTAAGAA AATAAACCAT CCAAGTCCGG GCAGACTGAG 180
AGCTACACAC CCCTCCAAGC CAATCTGGAG TGGCTCTGCC CAACCCCCAC TGCTGGGAAA 240
ACATGGCTGC CTCAGCACCT CCCTAAATGA AGGGAACAGA GTGTCTCCTG TGGCCTTGAA 300
AATATTAATA AATGAGACTT AACCTGATGG CTCAAGGCTC TCAGGGGGCT TTTTTTTGTT 360
TTTACACACT CTGTGGAGCT GTTACAAbw 1 TTTGCATGGG ACAGACAATC 420
TGTTTTAATA TTTTATATGT TTGTCTTTTA AAAAACCTAA GATCTATATC TTTTTACATT 480
TTATTGTTTT GTTCAAAAAA AAAAGTTTTA CACAATGATC AAAAAGTTCA AATGAAGTCT 540
TTTTTAAACC TCTCTCCTGC CAAAGGAAAC CAAGCAAACT TTTTCCAGAA ACCTGATAAG 600
AATATCTCCC TTTTACCCTG GAAACATTAA AAATAAGGAT CCCTGAATTA AAAATTCTAT 660
TCCAGAATCC TAATTTTATT TTTTATTAAA AAAAAATAAA ACCCCCTTAA CTGACGGGCG 720
GTTTTTAAAT CACCTGCCTT CAAAACCCCC CTGGAAATTT TTAAAATTTT TTTTTTGTTC 780
CCCAACATTC CTCCCCCCCT AATAACACCT GATTGATACC CACCAATTTT CCACTGTGGG 840
TGATTGAGGT GGTCCCCCCT CTTTTTTGCC GTTTGATTTC CCCCGTTAAA AAATTTAGAA 900
AAAG 904

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 917 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

AAGGGGGGNG AAATTTAGNG GACNAAAATT ATTCCTTAAG GGCCNCCTTT CTTCAGGGAA 60
NANGGGGGAA GGAGATANTN CGGCCCTTGT CCGCCTTTTN GGANACGATA GGGNCGGTTC 120
GGNTTGGAAA TTTTTCCTCC AAAATTNCCA ACAAAAATNG TTTTTCCCCT TCCTTCAAAA 180
AGAAAATTGG TTTTTTTGNN GGCTTNGGGG NGTCNGGAAG TCANAACCCN GNGTATTATT 240
GCNTTCCAGC CCCACCCGTN AGTTCATTGG TAATTCCTAT TCGTTCGGNT CAANATAATT 300
CGGNACTTCC GCTTCCNAAT GGATCCCTTC AANGATTNGG TTTTTCCGGA TTATCGCAAG 360
TCCCCNGGTT NTCCAATCCG GAGCGCNTCG GATATTTCCG GNTNTCCGTG CNTTTCTAGC 420
CCCACCCCCA NGACCACCNT TGGTTNTTTA GGTGGGTCTT TGATCCGCTT CACGTTGCTT 480
CAGTGACNTA GATCCTTNTT CGGTCTTTCC GGCTCATTTT AGTCTCGAGT TATTCTCAGC 540
TGTGTTANAA AAAAAGANNA NAANAANCTC CGCCTCGCCC TTCCGNTTCG GTTCTTTCCG 600
CNNGCN V 1 U(j NTCTGCCTTC TCCACGTGAC GNTTNTTCGG CNTCCCAGTN 660
ACCCCCTCCN TCCACGCCTT CNTCCAGNTT CAGCTTNTGT GCTCGTCCCG GNTGTGCCGC 720
CANNTNGTGT CAATTCCNGA CCGCGGCGGG GGCCGGGCAG NTGGGGNATN TAGGGCGGGC 780
AGACAGTCGG CCNATCTCCA TAGGCCGTTC CCTATNCTNC CCTGATTTTT TTAAACCATT 840
TCCAAAAGCT CGCTGTCCTC TTTCCGGGNC TTCCATTNNG GNGTNTCCAN AAGGAAGNAA 900
GNCNAGTAAA GGANCTC 917



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 835 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GGNCCCCTAN NGATTGGCCN TTGATCAAGA NGGGACCATC CTGNACCTGG NGGTNGNTGT 60
TTCCGCTTGG GACGGAGATG GTTGTTTTTG CGGAGTAGTT TCNGNGGGTT TGAGGCGCGG 120
NTANTTTTTT TGTTNTGGTC CAGACCGTTT TGATTTAGCC GCNGCNGACA GTAATGGGGC 180
GATACCTCAG NTCCTTGTGA ACCCAGGGTG CAGNTGGTTC AGCAGGATAG ATGTACAGCC 240
TCCGAACTTT TCAATTCCCN GACTAACCAT TGATGTCAAG TTGAGTGTTT AAATGCTTGC 300
TACCAAGCTG GTTGGTAACC TGAGTTCAGT CCCTGGAACC CACATGGGGA GAGAGAACAT 360
GCTTCTGTAA CTTGTCCCCT AACTACCCCC AATACACGCA TGCGCGCGCG CGCGCACACA 420
CACACACACA CACACACACA CACACAGAGA GAGAGAGAGA GAGAGAGAGA GAGAGAAGCA 480
CAAACAATAA AAGAAAAAAA TAAAATCTCA TTTAATTTTC ATTAGTATAA TACCTTGATT 540
CTTTGAATGA CAGCAAGATA AAGTAAACCA AAGCACACTG TAGAAGGGAT TACGCAACTG 600
AAAAGTGACA ATCCTTACTC CAGCCCTTCC TGCTATGTTG GCAGTCTTGC TGGGAGCCAT 660
TGATCTAATC AGTTTTATTT GAGGCAGGGG CTCATGTAGC CCAGGAGGAT GGTCAAATCC 720
ATAGCTCATC TGAGGATGAG TTTGAACCTC TGACCCTCCT CATTCTCCAG TTCTCCATAT 780
CCTGAGTGCT GGCACTGAAA GACNCCACNA GTAGCCTTGG CAGGCTAGAA ANGNT 835
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

GTNTTTTNGC CGNGGGAATT TAAGGGNGAT TTGGAGACTT TNGAATTTTC GAANGTTCCA 60
AAATAGANNT TNAGGNCAAT GGGNTTGGGG CAGNGGNGCT TTTTTAAATC ANANAAGTAT 120
TAGATTTNTA TGGAAACCCT GGGGGTTCCA GTTTAATCCC TTCATCATCT TGAAATATNA 180
CTTGTTTATG GGAANGGTGN GATAGCAGCC NGAAACAGAG GTTTTTATTA TTACTGTTAG 240
AGANGAGGAT TGGGGAATAG AACAATGAGA GTCTTGGTAA TATTNTTCNG GAAACAACNG 300
ACATAATTGG AACATTAAGG AAATATATCC ATGCATTCTG TACTTGCAAA TTGCTCCAAG 360
GAAGATGGAG AGTATTGTAT TTCAGATAGA GATANGACTA TACCTGTTAT TTTTTTCATT 420
ATAGCAACAT TAAAAAAGAT AGTAATCTAA TTTCACATAA CCATTACTAC TAAAGTATAT 480
ATGTANTCTT TGTTTATCAG GTTTTACTTC TCAGAAATTG CAGCATCTCC TACAGAGCCT 540
GTCAAATGAG ACNGCATAGA TCCCCAGAGA ACAGAGAGAC TGGGAAATCA TTGAAATTAC 600
ACAATCCTAT CCCAAATGTT TGCGTAGACT CAAGCTCGTA TCAGCTCATA AGATCAGTGT 660
GTGTGTGTGT TTGTGTGTGT GTGTGTCCCG CACATGCTTG AGTATGCATG TGTGCATGCA 720
TGTGTGTATG TCTATTGCAT TAGTAGAGAT GTTAAGGTTG AATGTATTTT CTGCTCATGG 780
TCATTGTAAG ATATTGTGCT GTATGTGATA AGAATCAATG TAACAAGGCT GGAGAGATGA 840
CTTCAGCTGT TAAAGGCTAG ACTCACTACC AAAAATAGNG CNATCAGTGT GAANTTCCCC 900
ACAGGAGCTT AGCAAGNTAA TAGG 924
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 435 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
GATTCCAGAG AGAGGAGTGA ACTGGCAGAT AAGGCAGTCA GCATAATGGC TTAGATACCA 60

TGTGCTTTCG CTCACTATGC ACCCATGACA CAAGATCACA GGGTACAGGC CTGGACCATG 120

GCAGAGTATA CACTGGTTGG GTAAATGAAG AGGAGAGACA GAGTGGGAAG TCGGCTTAGT 180

GGATATGGAC TTCAAATTTG ATGAACAAGC AATTCAAATG AGTATCGTGG GCTTGANTGG 240

TATGAAGACC CGTTTGCAAA GCAGTGGTCA TAAGAGAGAA AAGAGAGAGA GAGAGAGAGA 300

GAGAGAGAGA GAGAGAGNAA GAGAGAGAGN GTGTGTTGTT GTTGTTGTTG TTGTTGTTTA 360

TTGGTTNATA ACAANATNTA CCTTTGGGCN CTTTNGAAAG ACTNTNCACA AAGGAGCTTG 420

NCAAGCTAGA AAGGT 435 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 919 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CCCCNGTTAC CCNGANGTTT ACNNGTTGGA TTAAANGGGN NNNAAAACGG GTGGGGNNAA 60

ACGAATTTTT TGTNCNCGAC CCNTCCCCGG TTGGGGNTGG NGAAATAAGT TTTAAGGTGG 120

GAAANGGAAA GGAAATAAAA ANATTTTTTT TNAAGGAAGT TCCTTNCCAC AAAAAANTNG 180

NTTNGTTCAG TAGGGTTCGG GCCCGGGAGG NAAGGCAANN TTGAANTNCA NTTAAAAATT 240

NCCNGGAANG TACCTTGGGN AGGGATTACC NTGNAATTTN TTTAAGAAAA NNTGGGTNTT 300

TTGGGGNGAT TTTNNGCCCC ACCTGGACCA NTTTNGGGAA ANGCAGAAAC GTTCCAGNGN 360

GTTTTCCTTC CAGAGAGAGG GTTAGGTTCC TTCAGGGGNT TCCAAGGACG GGGACCAGAA 420

NGTGAAACAA ACCAGGNTNT GAAGAGACCA GNCGGGGGGG GGGGAGGGGG CCGTTNTAGA 480

TAGATTGAAC CTGCAGAGTT GCCTGTTACC TGAAGTTGTC ACCNTTTNAC CNACANACTT 540

NATAAANNTN TGNTGACCAT NTCAGCAAGT GTCACCTTCG TTGCCAGGAC ACAAGTTTCT 600

TAAAGCTTAT TTCAGTNTCA CCCGCTGGGG AGANACATTC AGGGCATGGG CGTCCCCCAG 660

CCNTCGGGGA GAATGTGGGA GGTGGCGATG TGGGAGGGAT TCGAGAGAAG AGAATGCTTA 720

AGAACCATCC AGGGAACCTG TGCGTTTGAA GGTNTGAGTT ACACACAGGC TGCTCAGGAA 780

GGAGCTAGAG CTCCAAATAG GAGCTGTGAT CAGGCTGTGT GTGTGTGCTG GAAGGGCCAG 840

TTAGCAGAGG TTGTNTTGAC CACCCAGNCT ATTGAATTGN GNNTNNTCCC AAANGGANNT 900 

TTGGCAAGTT AATGAAGTC 919 
(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 915 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
TTTTTTGGAA TNTTGGAACC NCGNTTTGGA AGAAGACCTT TNNNNTNCAA TTGGGGAANA 60
ATAACCGGGG CCAAACCTTG GGAAGGGGGG AAAANATTCC NGGGGGGAGG TAATTTNTTG 120
GNNGGNAGGG GNGGAGGTTA NTATNNCGGT TGNGGAAGTT TGGAATTGTC CNAANGGATT 180
TTGTTTAAAA AGAGGNTTGC NGGGCNTGNT CCCTTCAACC ANGAGGTGGG GCCNTTGCAT 240
TTATTTTCCT TTTAACNTTT GAAGGTGAAG CCGGGTTATT TNTTTGTCCT TCGTACATTT 300
ATCACCACGG NGTTTAAAAN GTNTTTTTAT TTCGNTTTNA TGGAGGNGAG TTAAATNTCN 360
ATTTCCAATT AAACCTCNGT GAAACCTTCT TTGATCCTGC CTNGTGTTTC CTGAGTGNGA 420
CATACCTGCN TAGTTNTGGC CTTCCCTTTC CTTNTCGTCC TTCTTCCATT CCCTTCCGAA 480
GATTCCTGAA GGAGTGAAGG TTTGGGAAAG GGGGAGGGAC AGAGTGTCCA GGGCTTGCGT 540
GTCAGTAGAC ANNAAANAGC CGNAGGGCAG CCCGGGGTGA AACCACAAGG CAGAGGCCCC 600
AGGGTAGACA GCTGACAGGC CCGCCCACTT TGGCTCCTGC NTTCGCTGTC TCACCCCAGA 660
ATTTTCCTGG CAGGAGTGGA AGAAGTTGGT ATCGAGTCTT TGAGCCCTGA CTCATTNTCT 720
GTCCTAGCTG GGTGCTCCTC AGTTACATCT CCAAGTGTCT CTCAGGGGTT CAGTGTTAGC 780
CACATGGCTG CCTCAGNTCA AACCGGAAAC CCAAGAGGCG GAAACATGCT TCATTTAATT 840
CCCATCTGGG GACCCNTACA AATTTANGGN TTGTACTNAN GGATTNCCAC AANGNNAAAG 900
GCNAGNTAGA NAGGT 915

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 849 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
GTTAAANANG AAAAAGNGGG GGTGACAGGG GGNGANACCC NTTGCGCCGG GCTATGGATT 60
NTNGGCACCG ANAAGATTTN CAGGNGACAN GGAAGGTGGN NGGGGANGGG GGAAAGTTTN 120

GAGGGGCCAA AAGGANAAGG AGGANGATTG ATTGGTTNGG GAGCAGTACT TGGAAAGAGT 180 

GTGTTNGATC GGNAAACAAC CACGNGNAGN GNGTTTTTGT TGCAGCAGAG ANAAGNGAGA 240

AAAAGATNTC AGGAGATCTT GATTTTTTTC GGGTCGAGCT ANGTTGGGGG ATGNGAGGGN 300 

ACAATTCACA AGATTTGTTC ACAGGGAGNT CNAGGAGGTG GTCCCANTAG CCGGTAGGGG 360

GGTTTTCTCA ANAAATGGGN TCAGTCAGGT GNTTGCCTAG ATCTTTCATT AGTTCCTCCC 420

TTCAAAGGGA NTTTGAAGGA GTGCTTTGTC CTGTGGAGCA ATTGACTCAA TCAATAAACN 480 

TAAGTAATCT CCCGGANTAC TGNNGANGCG TTCCCAGAGA GGTCCCCCGT AGTNACCAGT 540 

GAATCACAAT TTCCTAACCA TANGANTNTT GTTAATCTCA CCACATAAAC CCACAATTCT 600 

CGCGTCCTTN GTGATGGTTT CAAAGTCNGG AATATNTTTT CCTCCATCCC TCCTTTCCTT 660

CCTCCTTNTA TCCCTCCCTT CCTTTTTTCC TTTCACAGGA TCTCANNATG CAGCCCAGTC 720 

AGGCCTTAAA CTTGTGATCC TCCTGTCTCA GCCTCCTAGG TGTTAAGATG ACCCAAATGT 780

AAACCATGTC CAGNNACTTC CTCCTAATCC CATCTTCAGA TATCCTTTAA GACCAAATTA 840

AATATTAAC 849 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 925 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

AAAAAAANAA ATNTTGGNGG ACCNAANACC ACCAATGGGT TTTGGGGTCC GANCGNNCAA 60 

ACNTGNTTTC ANTGTTNTTC TGGNTTTNTT TGNNTAAACT TGGGGTTTTA AGGGTTNAAG 120 

GTTCCAAACC CNATGTTTTC GCNCAATTTA GGCGGGGNGG GGAATCCNTT TGGGGANGTT 180 

TNAGTATCTA GTTAAGAGGG GCCATTTNGA GATTGACACC TGAGTTAAAC TTCNGAACNN 240 

AGNTGTNTAA TNAACCCGTG AAGGGGCTGA GGGGNGTTGG TTANGATNCT CAATNNTAGG 300 

GNAAAAANNA ATGTGGTANG GAGACAGTAG NNTANTCGGA NCAANTNCGC ATCGGCCNTT 360 

NNATTAATAA GCAGNCAATT GAGGAGGTTA TCCACGACAG NGANAGGTGC AGACCCCACG 420 

CACACTGTGA CAGTGGTTTA TGTNACANNA TNTCGGGAGN GATGGNGCCA CACCNACTGA 480 

GTTCCGTTTT GTTCGGNTGA AGGTAGGNCA ANACTGGCAN AGGTGTTNGG GGGCNAGACG 540

NGAGATGNGG NTTGAGCNTT CAGACCNAGN TNCANGGNNN NGGACNANGG TCCCCNGNGC 600
CNTTCTAGCC TNGAGCAGNT TCNAGAGAAN TATTCGNCGG GTATAGGTCG CCCCNANGAC 660
GCNAAACGAC CGNGAGCGAG GGCGGAACAG CCAATCAGTT CGANTTATCG TGTNTGTTNG 720
CGGGGTTTGA TCCCNGAGTT AGNTCAATGA GCCCANAACC CTGAGTGGAG GNACCGTCAT 780

GGGAGGAGAG GNGAGTCACC NGGTACCTGG CATACNGATG GACCATCCAG TANTTGGATN 840
GGAGGGCGAT ATNGTNANTC TTAGGGGNTC TCCTGAGGAG GGNATACCCG TGAGTTCCGT 900
AAGGGCGTTN GCAAGTAANA AGTCG 925
(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 827 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GCCAGTTGCC CTCAGATGNC CNATACCCCA CNGGGGGNGT CTCNCCCCTC TCTCAANTGT 60
ACACACACTT CCCCATAGAC ACNGGGGACC ATAGCTCTAG GGGGAAAACA AAATNTTATN 120
TGTGTGTGCA CNTGTGNGTG TGTGTGNTGC CCCAAACACA GGGGTNTCTC TTCCCCAGNG 180
GCCCTAAAAT GTTNTNTGTT CNCCACTNGG NCCTCATNTN NACATACCCC CCNNGNCTCN 240
GNCCCNNATA CCCNGACANN GAATGTGTGN NTNCCCATNN GCGCTNTCAC CACCACAGNT 300
TTTNTAANAC ATCTCTCCCC NNNATATCTN TTNTTTNNTN NGGGTCTCAA TGGAGACNAC 360
ATATACACNA GTGTGTNAGA CACACCCCCA CACCCCAAAT GNGCGGGGGG AGGGCTCTTA 420
GCGCAANGAG AGNGCAGNGT GCTTACTCCT CGCCCCCTCT AGAAAACTCA CACTNTTNAG 480
ATCTCGGGAC TCNNCCTCAG CNCATTCTCT ATCTCCCANA AANACACAGA GNNACCCTNT 540
TTGNGAAAAC TCANNTGTGT ATAGTGCTCT GNGTGTNACC CCNAGNCCAC ACCCCCATAA 600
NANATNTNTC TCTCAAAACA TGTGCATGNG CGTGTAACAC TCNCCATCTC TCGGGCNNGC 660
TCTCCCCNTN ACATCTCTCG NGNNAANANA AATATATCCC CTCNNTTANC CCCCGTGTCC 720
NGGANAATAT TNCCCCCCTG NGACCANTCC CTCCCCGGAG ACCNANCCCC CCCGTGGANA 780
CCCCCCCCNG GNATCAACCC CCCCGGGTAN ACAACCCCCG GAACCCC 827
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 899 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



PCT/US97/06067 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

AAAAATTGTA AGGAGTTGGG GGNATCCCCC ATAATTNAAA NAGGGAACAA NCCNTAAAGG 60
GAGGGNNGGG AANGGCCAAN ATTGGNTTAA AAANAGTANG TTTGGTTGAT CCAMACACAA 120
GGAATTTGTT ANAATTTTNN TAATGGAAAT NGGGCACTTC AATTGGGANG ATAAAACCCC 180
AGGAAGTGAT ACCNGGGTTA TCAAGTNAAA CNTGATTCTT GGNGNNGAGG GAAAGGATAT 240
TGAATTTGAG TGAGTGCAGG TGAAGTGAGA CTTGGGAGNA CAGGTCATGC CCACCCAAGG 300
GAGGAGCAAG GGNTGGGCAG TGTAGGTGGT GNGGTGGTCC TTCCTGGGGT GGGCGGGGAG 360
ACAGATGAGA ACGTTATTGG AGGACAGGCA CAAGTGTTAC TGAAATGCAA ATCCCTGTAG 420
ATNTGGAAAA GTTCTGGNTT CAGGCTTGAT GCTTGGGCCG GCAACTGTGN ACTTTCCCTG 480
TACGTTCAGC CCCCCCACCC TTACGGAAGT TNTCGTCACT GAGANTAGTG GCTAATCAGA 540
GTCTTCAATG GACCTGCCAA TCAGAAAGGA AGGCGGGCTT TTCCGGGTGC NTAGGTGTAG 600
GATTCGCTCA GTAGTTAAGC AGTCTTAACT GGTTNTGGCT GCTGTGCTCT CTGTCCTGCC 660
GTTGGATTNT NTGAGGCATG TTCAGGCAAG CTCCAAAGTT GCGACATGGT GAGCACAGGG 720
GCAGGGGGGG CGGGCGGACG GGCAGGGGAC TGAGCAGTGG GAGCTGGTGT GGTGGGTCTT 780
TCCCGGGGCT GAGTTGGAAT CCGCGGCTAC CCGTGAGGTC TTAGCCACTC ACTAGACCCA 840
GCGGCAGTTT CTGAATAACT TTCCTTGTAG GGGCTGCAAC TCTTGAAAGA CCCCACCAG 899



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 852 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 



AAAACATTGG CNAGACTTGT AATAATTNCC NGTTNGGGGA AAANAGNGGN NTGNGCTTCG 60
GGGGNGGGGA NCCGAGGTTC CCCCCAAATT TCTTANNAAT TGAGGGANAT TNANGGGGGG 120
AACCGANNGN TCNNNAAGGN GGGGTTTTTC CCNTTNGCCC CCTTGGGGNT TNACAANTTG 180
ACCNTNAGTT AACGGGGANA ACCCGCCNTG TCCTNNGGGA GGGGGGTTCC CTNGGGAGTT 240
NCGTNGTGGG TTTCAGTTCG GACCAGGTCG TTNACTCGAA AACNGGTCCG CNGTATNCAC 300
CCGGTNGGCN GNCTGTTGAN NGCTAACGNG GTAAGTATTT TCATGTGTCC GAACGTGTTA 360
GACTCCAAGT ATGGCCATGT GCANGAACCN CCGGTTAGCN AGACGCAGAG CGTGATCNGN 420
GGAGGNTCTN CAGGNGTCCA ACCNGGNANG NCAAGATNCG TCGACACTGG CAGNACCCAN 480
TGGNGACTGG NNGATCAGAG GGAGNCAGGT ACGCNGGGAA ACAGAGTTGN TGNATTGGAT 540
CCGGNANACG GACANNCNAG NGGGNCNGTN GTTTGGTATG TGNGCTAGNA GGANGCCAGG 600
NACAGTCGGA AAGGNTGTCG GGAGGNTCNG ATCATGTCNT ACATAACCNC TCGTGAGTAT 660
GCGGTGGNTG TGGAGTTGNG CAGGCGGCAG NTAACGCACC AGAGAATTCN GATNTNTCCG 720
CAGATCGACA GATNTGTTAG GTGGGTCTCT GACGTTNAGG NCGANAGGAN NNGGGAGNGG 780
ATAACANTNT CACACAGAAT TTCACTGAGG CTGAAAGACC CCANTTGTAA NTGNCCAAGC 840
TAGCTGAAAT CG 852

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 967 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:



AAANCCTTCC CGGNGGGGTT AAAANAGATT ANGGGTTTTC CGNGGGGAAN CCCCNNCCNC 60
CGCCTTCGTA ATTTGTCCCC AAGAAAAATT CCCGCGCCCN CAAAAANNAG GGGANTNGGG 120
GAAATNTTAG NGGCCANAAG NAAAAAAGAN AATTGTTTNG TTTTGGAGNC CACNNCGNAA 180
NAGGGGGTNT TAAACGCAAN AACACCGGGG GGGGGNTTTT TNTTNCAACG CGAAAAANGC 240
GGAAAAAGAT TTCAGGANAC NTGAATTTTT TNGGGTCGAA GTTCAGTGGG GGGATTGGGG 300
NGNNAAAATT TNANACNGAT TATTGGTCCN ACCTTTCTCC TTCCCNTCCC TNCCAAAATT 360
TTNTCCAATT TTCTTCTTTN TNTCCATTTC CCCACCAGGA GGGAGTCACC CACCTTNTGC 420
NGCAACATTC TCAGGGTTCT TCATTCTCAG TGTAACAGCA GNTCTTCNGG TTCTNGGGNA 480
NTCAGAAACT GGGCTGAATC ATGTCCAGAG TTGCNGAGTT CCCACATAAC AGATAGTGTT 540
NGNGAGATTC TCAGTCTAGA ACCATGTGAG CCAATCCCCA TCAAATCTCT TCTCTCANGN 600
ATAAATNNAA ACATNCTTAN GGGAGGCTCT ATTTCTATGG AGAAACCAGN ACCCATATTT 660
NGGGCTGGAT CACTCTTTAT TTCCATTATG GGATGTTTAA CAGTAATCCT GGTCTGCATT 720
CCNTAGGTGC CAGTAGCCAT CTCCTAGTTG TGACAATCAT CATTTTCTGG GGATGAGGGT 780
GGAGAAGGGG GCAGATATCA AAACTATCCT GNATCTAAGA AATGTTAGTT GAAATGAAGT 840
TGTCATGGGT CATAAAGTCT AGGATAAAGA GTGATGAGAT GTCACTAACC CAACTCTTTT 900
GGCCAGAACT CAATGAGGTN GTCCCATTTG ANTTACCCCA AAGGNGCNTT AGCAAGTAAA 960
AGGGNCG 967

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 700 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

GGNGTGCTGG GATTATAGAT GCACTCCCCC AAATCCAGCT TTTTACCTGA TACCGGAGGA 60

AGGAACGGAA GTCCNCCGGC TTGCACCGGA AGCAGTTTCA CCCACTGAGC CATCTCCCTG 120

GTCTGTCTGT CTCAGCTTCC TGAGCTGGTG TTATGGCTGT GCACCACCAT AGCTGGCTTC 180

TTTATTATTT ATGTATGACT NGGGTCTNTC TGGGGGTCTG TTAGNCAGTC TGTTAACTAC 240

CATCTTTTGN CTCAGGCAGC TGCAACAGAA AACAACNGGC TGTAAATNGT TTTGACAAAT 300

GGGTCTGGGG AGAAGTCTGT NATGCAGGGA GATCTNGAGT TTATNCAGAG GAAAAGGTGT 360

CTNTCAGNGN ATCTAGGGNA GCATNTCCTN TCNGCGTCTT GGTTTGGGNG AANGANGGAT 420

CAAGAGCCCC NNAGCNNNNN AANTTNCCNT CGAGCAGCCC AGGGATTTTN GCTTTCAACG 480

NANCTNNAGG GAACCCCCNA NCAACCTNGG CNACAATTGG GGNNTTTCCC CCNCCCCCCC 540

CGATTACTTT TNCAAACCNT TGCCACNCCC TCGCNCNATG CCNANCCCCC AAAACGTCGT 600

NNTTCATAAN CNCNNCNCTC NCNCTTNNCC CATGGGGNGC ACACTCCCTT CNCCCNCNTN 660

TNTTAACNGG NGGCGCAAGN CCTTTCTTNC CCCCTNCCCC 700 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 229 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

NCNACGAGAN GTCAANGTGN AANCTGNCGA TGATNAAAAN AACCGANCTT AGGGTGNCAA 60 

NGGGTTACCC AGGANGGGGN CAAAGCAAGN TCCAGGCCCA TNANGGACCT GCTGGTNCAT 120 

NGCCNGNAAA NACCTACTTA TCCTNGAANA GCCCGAAANG TCCGCTNNGA CCANNTAAGT 180 
NCANNNCAAN ANGNACCACN CCNTTAACAC CACCGTATGA NCCCNAANT 229
(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 465 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
CCCCTTTCGN NGGCCTCAAT NANTNATTGN CTACCCNANA GTGGCGGTCT NNCATCATGA 60
CAAATAAANC AGCCTTCATG AAATACGATG GCGGGGGGAT TAGAGGNNTT TNTTGAAAGA 120
GCTGAAGGGG CTTGCAACCC CATAAGAACA ACAATGCCAA CCACCCAGAG CTTCNAGGGC 180
ATTAAAACAC TACTGAAAGA CTATACATGG ACTGACCCTG GNCTCCAACT GCATATGTAG 240
CAGAGCAAGA GCCTNGTTGG NGCACCAGTG GAAGGGGAAG CCCTTGNTCC TGCCAAGGTT 300
GGNCTCCGAG NCCAGGGGTA ATNTNGGGGG CGGNGGAGCA GTAAGGGAGG GTGGATGGCG 360
GGGCTACCCA TATNGNGTGG CGGAGGAGAT CGNNGCTNAT GGACAGGAAA CTGGNAAACG 420
GGAATNACAT TGGANATCTC NATAAAGNNN NCATTTCTTA TTCNA 465
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 564 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

TTGGGGCCGN TNAACTCTGN GTNNNAGTAT NCCCNANAGG GGGGGTCTCA CANCGGGTCN 60 

CACCNCATNT GNGGGNGCCC NTTCNCNACA ACACATTTTG TCNGGNGGTT ATAGNGAGAG 120

CACANATTTT GAGAGTCNCC NGANAGGGGA GAGAGACNCA CACNAGTCTC TTCTCCCCGT 180 

GTTCGCGAGN GNACNCTTCT CTNCACATCT ANAGTATANC CCAGNGTCAC ATATGTGGCG 240

GGGGGGTNGT GTCAGNNACA GNGTTTCCCC CNCCNGTNTT TCCCCCTNCC CCCCCCNCAG 300

GGGNAGACAA NGTNNTAGAG AGAACAGGGG TTATCCACAC ATCNCACTGN GNGGCACAGG 360

AGGANNANAN TTGTGCTNAG AGCCCCTGCN CTTCTGGTGG TANCTCTGGG GCCCATATTC 420

TCTNCTCTGG GTCCCCCCCG GGGGGGTGTN NCCCTCNCCG GGAGAGAGTN TTAGAGANAA 480
ATCTCCATCN CANATGANAA AATNTGNGGG NGAGAANCCC GGGGGATATC ACTNTTTTAN 540

AANNGACCCC ACCCCCCCCC CCCT 564 
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 822 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

GATTTGCNCT CATATNTCNT TTACCAAACA GNGGGNGTCT GCCCCCCTGT NATANACCTC 60

TTGTTNTCGC GGGGTGCTNN TNGGGGCCCC CCNTGTAGAA AAAGAACANN NGNTGTGGGN 120

GGGGGATTTC TCTCTGNTGT AGANCTNTNC NCTGAGACAC ACAGNGCCCT GTGTGGGGTC 180 

CCCCTCNCCG AAAAAGANAC CCCNAAAAAA AAAAAAAAAN AGACCGCGNG GGGNNGAAAA 240 

ATATCTCTNG NNATCTTCTC TCTAANCTCG CTTTTANTCC TCAGAAAACC CCACCCCNCC 300 

NCTCTNCCCA GAAATATNAT ACANNNNGNG TTCCCCTNCC CAAAACCCCA AAGGGNNTCC 360 

CCTCTCNTCT NCCCCNAATA CTCTTCCNCC CCTTNATTCT CNTATCTCTN NGGACTCANA 420 

CTCTAAAACA CANGNNNCTT NTCTGTGCCG CAATNTNTTN TGTNACANGG CNCCCTGAAA 480

AAAACCCCCG TGTTCTCCAC ATCNCCTCTN TNATATCTCT GCCCCCTTCC NCTATATCNC 540

TGNGTTTATA ATTTCCAAGG AGAATGTNCN CAGGGGGGCC CCAATCTCCC CCCCTNGTTT 600

CNNCGAGNAG GGCTCTTTTN TATATTTTTN NTCNAAACCN CCNTTGTCCT TTTAAATNGG 660

CNTTNACNCC CNGNCCCNCC CAACNNCCCG ANCGGGGGAA ACGTTCCCCA NTTTTCCNTT 720 

TCCCCCCGCC CNCCCNNACC CCAATNCCCT TTTTTCGCGT TCCGGGGGCC CTGTTTCCCT 780

AANCCCGGAA TNAANTNCNT TNTTCAANCC CCCCCCTTTT TT 822 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 553 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
TTTGGGTGCG GTCTCCTCTG TGTTAGTGTA TCCCCCATAG GGGGGGTCTC ACAGGGAGCC 60
CTTCTCTTTT GGGGGGTTAT ACACAGGGGA CACACATGTG ATATAGAGAG AACACATGAG 120
AGTGGGAGAG TGGGGGGGTG GGTGGAAGTG AGAAACAGAG AGAGAGAGAC TTTATTTTTT 180
GTGGTGTAAA ATGTGTTGAA TCTCTGGTTT GATAAATTTT ACACATTGGG GTTTGTGTAG 240
ATCCCTGATC TCTCTCCTAT CCCCATTCTC TTTCAGAGAT GTGTCTCTGG ATTCTCAGAG 300
AGATTTTCTG GTCTCACATG TTTGGTCCCT TATGTTCTCA CTCTCTCTTC TTTATTCTCT 360
GATACATGTG CTCTTCCCCC TTGGGTCTTC TCTCTGTCTC TGTCTCCCCC CCCATGATAC 420
ATAGAGTGTG TTTTCTCCCC GGGGTTTCCC TTGTTCACAA GAAGAGCTCT GGGGAATCTC 480
TATCTTCTCA AGGGTATAGC CCCCCAGTCC CCAGGCCCTT TTTCTTGGAA TTTTGGAGGG 540
GGTTCCCCAT TTT 553



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



GGGATTTGCT CTCAGATGGT AGTTTACGTA AACTGTGGGT GTCTTGCCTC TCTCTCAAAA 60
CATGTGCGCG TTTCTGGGCC CGTGCGCGTT TTCTGTGCTC CTCCTTCTTC ACTTCTTTGT 120
CGCGGGGGCG CTCGCCCCTG TGTTTTCTGT GCTCCTCGGG GAGATGCTCT CCCTTGGGGC 180
TGTGGGGCTC TGTGGCGGTG GTGGCGGTGT CCTCGATACC GTGCTTTTTT GTTTTCTCGA 240
GATCTTACTT TTTCCTCTCC CCCTTGTGTG TTTCTTGGGT ATACACGAGA TTGTGTGTGT 300
CTCTTTTCTT ACCCCCTCTC TAGTTTATAT TCACACTTAC TCTCTCTCTT TTCTTTTTCT 360
CTTTAGATTC TATCCTTTGT GCACTTTTTC TATTGTGCTC TAGATTTCTC CCCTTTTTGT 420
TTATTTCTCT TCTCCCTGTG TCCAGTGTGG TGAAAAAGAC CCTTATTAAA TTTAGACTTG 480
TGCGCTCTCT TCTTAAATTT CATGTGTTCT ACAGTCTCTC TGCGCTTTAG ATATTTTTAG 540
AAGCGCCTAA ATCTTTTAAA PJ GTGTGAG ATCTCTTTTT TTTTTTTACA CTCCTTTGTT 600
TTTTCTTACT CCTCAGGGGC ATATAAACCC CCCTCTCCTT TAATATTTCT CACTCTCTTT 660
CTTTTCAAAA AAATTTTTCA ATCTAAATCC AAATTTTTTT TTTTTTTTGG TGGCCCCTAA 720
TTTTTGGGAA CGGCCCCCCC CCCTCCTCTG GGCCCTCATT GGGGGGATTT TTTTAATTCC 780
CGTAAATAAA AAGGGTCGGG CCCTTCTCCC CCCGTGGGGT AATTAATCAA GGATTTTAGG 840
GTTGGTAAAA ATTTCGGGTT TTGATGGTTT TGCCCCCCCC TTAACCCCTC T *r *^^*r ^r^^^r t 900
TTTT 904
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 698 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

CTCAGCACTG AAAGAGATAG ATTAAAAACA AAACAAAACA ACAACCAAAA AAATACAAAC 60

AAACAAACAA AAAAAAACCC CAAACAAGTC GCTCAACTGT CTTGAGTCAA TAGATTTTAA 120

AAAATGAGTT AAGGTTAGGG TTAGGTTAGG GTTAGGGTAT AGCTCAGGCA GTAAGGTACT 180

TGCCAAGAAT GTTTGAGGAC CTAAGTTTGN CTTTTTTCTT TCTTTCTTNT GAAACAGGGT 240

TTCTCTGTGT AGCCTTTGNT ATAGACCAAG GCTGGCTTCG AACTCAGAGG ATCCACCTGC 300

CTCTGNCTCC GAGTGNCAGA ATTAAAGGCA TGTGCCATCA CTGTCCAGCT CTTAGGTATT 360

CATTTTTCAG CTTATAGTCT TTTGGCAAGG GATGCCAGGG NAGGAACCAG AGGCAGGGTT 420

GAAAAACAGG CCACNGNGGG GGGAACGCTG CTTCCCCGGG TTATTTTCTT GGGTCANATC 480

NTGTGGCCTT CCNGGGGGGT CTTTCCCCTT TCAAAATTNT TTGGGNTTGG GGNGGGGTCC 540

AAATNANTTT TTTNGGCCGG GTTTNGGGGN CCCCCCNNTT TGGNTTTTTT TTTAGAAGGC 600 

CCGGNGGGGA NAAACCCCCC GGACTAAAAA AAAAAGGGGG GGANCCCCCC NGGGGNGGAA 660 

TTTTTCCCGN CCCTNAAAAG NAAAAATTTT TNTTTTCC 698 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

GAAANAANTC GGGAGAAAAA NAAANNNCCN TTAAGAGCTT GCCCCCANAG AAAAANTANN 60

AANTNAAAAA CTGNTAGACC ANNNGAAAAG GAAGCGCAGT NANAAAATGG TTCCTACGGG 120

TTAANTAAGA AGCANGACNG AAAGANNGNN TNNATNTAAC CGGGGNTAGN AAACGGCCCN 180

CTTGTANNAG GACCNAATCG AANTAGTACG ATCATGNTAC ANAGGGAAGG GGACGTTACC 240

CNCGGANGAA ACCCGGCACA AGATCTCNNA AGGGAGAAGA TTCTGAACGN NANNAANCCA 300

CAAGGAAATT ACTGTGGANA CGGGAGGAAT CNATNGTNAT NNAGNNNAGC TGGNCACTTT 360

GANAAGGCAT CGATANAANT GATGATGGNT CAGGCGAAAG AGCATACGTA AAACCAAGCA 420

AGGNGGAATA GTCATANAAC CATGNAAAAA ACNTTCAATA AAAGATNNCC NGAATATTGA 480

rn r"KrrzT &NNN A ANAACNCCCG GTGGCCGTGA TTCCTTTTTT AACGGCAAAC AGCANNTTAG 540

TTTCAGATCA CCCAGATCAT CGNTGNAGAT NCCATNGATG TTNTTGAAAC TNANCTNGAG 600

GATTCAAGAA NNGNTGACAT GGTGAAATGA TGTACAAATN ACAACANAGA NCGTCGAGAT 660

NNTATTCCCC CNGNATGNAN GGACNTCTTA TGATGAANAC CTTATACCAG ACTCAAGTAN 720

AACNATATGA TCCCATGAGG GNGGNNACCC AGGNAGTCAN GAANAAATAC CNGAGAGTTA 780

AATGCNTTTT TTTGTNTGNG AACCCANTGC CCGACCTNTC AAANAGAAGC ANAGCCCNAA 840

AATTAATCCA A 851
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 936 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 








CTAAGGAAAA GGTTTTAGGA GGGAAAACCA ATAGGCCCTT GAGTTCTTAT TCTTAAGACA 60

TTGTAAAGGA AAGGTTTAGG GGAAAAATTA CCAGCCCGAT CCATTAGGGT TCCAAAAGAA 120

CCGTTCTTCC ATAAAGGCCA GAGTTCACCA TGAGTAACCA GGATGTTTCT TCGGACCTTA 180

TAAATATATT TTGAGGGGTT CATGGAATTG GGTTGCCATT TGGTAGTTGG TAGCCTACCC 240

TGCTCCTTCC CAGTGTTGGA TGCAGATATG CGCCCTGTTG GTTTTGAGTA GTTTTGAGAT 300

CAGTCAATTT TAGGTTTTAT GGCAAGCATT TATTCATCCC CACATTTTCT GCCAGGGTGT 360

AGTAAGTGAG TTCTTACAGA GCAGAGAGAA GGAGCAATCT GTGTTATCAA ATCAACTAGC 420

ACCAAGCACA CCAAGCAGCC AATCCTTAGA AGGAAGAAGC AAACACTTGG GTATCCTTCC 480

ATGGCTAGGA AATCTTCATG GCTCACGAAC CTTGGGATTT CCCTGTCAGG GTAGAATACA 540

AGCAGCTGAG ACCGAACAGG TATGGGTGGC ATGTCGAGAC AGGAAAAGAA CCTGTGTCTG 600

GGGAGAGGTG TGTGCTACAA AGCCAGAGAG AGGAACAGAT AGGGAGGGGT GTGCTGCACC 660

ATCATGGAGG GGGACAGACG ATTTGTCCCC AAGGAAAAGC TCCCTTTATG AGAGTTCTTA 720

CTGAATTTGG GAATGACATG GGAGACCAAG GGCCAAAGTC CAGATGAGCA GAGTGGGGAG 780




GAGGGTTGGA AAGTTCCAAG GAGAGAGGCG TGGGGGTAAG GGAAGCTCGC AGGGCTCCGC 840 
CTCTGCCAGT GACCTTGGAC CGCTTTCTCT GAGGATCAGA GTTATCTGTA GGGGAGATGA 900
GGTTGAAAGA TACCCACAAT AACTTTGGCA AGTAGA 936 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 911 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

GGGAATTTAA GGGNGATTTG GAGACTTTNG AATTTTCGAA NGTTCCAAAA TAGANNTTNA 60

GGNCAATGGG NTTGGGGCAG NGGNGCTTTT TTAAATCANA NAAGTATTAG ATTTNTATGG 120

AAACCCTGGG GGTTCCAGTT TAATCCCTTC ATCATCTTGA AATATNACTT GTTTATGGGA 180

ANGGTGNGAT AGCAGCCNGA AACAGAGGTT TTTATTATTA CTGTTAGAGA NGAGGATTGG 240

GGAATAGAAC AATGAGAGTC TTGGTAATAT TNTTCNGGAA ACAACNGACA TAATTGGAAC 300

ATTAAGGAAA TATATCCATG CATTCTGTAC TTGCAAATTG CTCCAAGGAA GATGGAGAGT 360

ATTGTATTTC AGATAGAGAT ANGACTATAC CTGTTATTTT TTTCATTATA GCAACATTAA 420

AAAAGATAGT AATCTAATTT CACATAACCA TTACTACTAA AGTATATATG TANTCTTTGT 480

TTATCAGGTT TTACTTCTCA GAAATTGCAG CATCTCCTAC AGAGCCTGTC AAATGAGACN 540

GCATAGATCC CCAGAGAACA GAGAGACTGG GAAATCATTG AAATTACACA ATCCTATCCC 600

AAATGTTTGC GTAGACTCAA GCTCGTATCA GCTCATAAGA TCAGTGTGTG TGTGTGTTTG 660

TGTGTGTGTG TGTCCCGCAC ATGCTTGAGT ATGCATGTGT GCATGCATGT GTGTATGTCT 720

ATTGCATTAG TAGAGATGTT AAGGTTGAAT GTATTTTCTG CTCATGGTCA TTGTAAGATA 780

TTGTGCTGTA TGTGATAAGA ATCAATGTAA CAAGGCTGGA GAGATGACTT CAGCTGTTAA 840

AGGCTAGACT CACTACCAAA AATAGNGCNA TCAGTGTGAA NTTCCCCACA GGAGCTTAGC 900

AAGNTAATAG G 911 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 781 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

TTCAGGGGTA ATCCTAAGGT AAACGGACAA AGTAAAGGGG AGGTTGGACC AATAAAGGGG 60 

AAAAATAAAA GATTAACCGG ATGTTCCCTG GAACGACAAA TTGCCTTGGA AGTTTCCTAT 120

ACGGAAAAAA ATGAACAAGT TTCCTGTAAA GCAGGTAGCC GGAACGTTTC TAGGCTATAA 180

ATTTAACTGG CCTTATATTT ACAAAGTCTA AACATTTTAC TGGGGCATTA CAATTTTATA 240

ACACTAATTA GATCATGTGT GTACACCCAC AGTCTGACAG ACAGGGTATT TTTTCCTTCT 300

TATCCCAAGT GAGTTTAACC TTCCTTCTCC ACATTTATTG CCATGTGCAA TGCGTAGCTT 360

CTATTAACTC CTGATTATTG ATTGAACTTT ATGAGACATA AGAATGTACT TGACAACAGC 420

ATGTGAGAAA GGGAAAGTTG AGGGACTGAG TGTAATAGAG ACTGATAAGA AATGAATGGG 480

CTGTGTCTGA CTCTTATCCA ACATTCCAAT TCTTCAAGTC TAAAGGTGAA GGGTCATTTT 540

CAATCTACTA AGTTTGAATA TGATTTGTGC TCCTGGTGTC TACAGAGTAT TAGGAAATGT 600

TTGGTTTGTT AGGTCATTAG GGTAGGGCTC TTATGATAGA ATTCTTGTGG CTTTACATGG 660

AAAGGCAGAG AGAATACACC CACCCTAAAC ATTTCTGCCA TTGTGCAATA CAGTAAGGTA 720

TATTTCTTTC TTTTTATTAA CTATTTGGTG ATAGTGACAA ACAACTAGAC TTCATATGTG 780



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 389 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



781 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

TTGCTCTTAG GAGTTTCCTA ATACATCCCA AACTCAAATA TATAAAGCAT TTGACTTGTT 60

CTATGCCCTA GGGGGCGGGG GGAAGCTAAG CCAGCTTTTT TTAACATTTA AAATGTTAAT 120

TCCATTTTAA ATGCACAGAT GTTTTTATTT CATAAGGGTT TCAATGTGCA TGAATGCTGC 180

AATATTCCTG TTACCAAAGC TAGTATAAAT AAAAATAGAT AAACGTGGAA ATTACTTAGA 240

GTTTCTGTCA TTAACGTTTC CTTCCTCAGT TGACAACATA AATGCGCTGC TGAGAAGCCA 300

GTTTGCATCT GTCAGGATCA ATTTCCCATT ATGCCAGTCA TATTAATTAC TAGTCAATTA 360 
GTTGATTTTT ATTTTTGACA TATACATGT 389
(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

AAATCGGGNT TNCGCGATTC GGTAATGACG NCNNATCCGT AAANNCATNC GCCGNNATNC 60 

NATTNGAAAA TNCCGGGNGC AANNCGATGT CTNATTGAGG TNNCAGANCC ATCCGGCACA 120

GGCAATANGN AAAAAANGGG AGTTTCACAA TGTNTNTGAA TNTGNANCCA TTGGGCCCNA 180

AAAANTCCTN CGNTNNATGA ACCTTNNCGT NCAAAANTTT GGTNCGACNC AGCNGCTTTG 240

CNAGCNTTNA ATAAACACCG GNNTCCANAA TGNNACCAGN GNTGTTTNTN TCNANTNGCA 300 

TNNCNNTTTG GAANCCCNCT TTTCCCAAAA CNTTNAAAAA 340 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 557 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

AGTCCGGGNA TGGTGGCANA TGCTTTTCAT NCCAGCACTT GGGAAGGCAA AAAACAGTTA 60

NACCTNAGGT TTANCCCAGN CTTTATTAGN ACCCCGTGTT CTNAAACACA AACNACAAAA 120

NTTTGNGGGN NTTTAAGTGN AAACACTGTG TAAAACCTTG GCCCTGATGN AGGGNTCTCC 180

TTTNGAACAG AAAATGTTTG AAGANTCCNA AAACATGTTG GGATGCCANA CGNGTTNTTG 240

NGCATCCATC TCAACGANGT TTTGNGAATA AATGGCAGGT NAAACTAGTA CATCATCATG 300

TNGNANCCAC CGGGCNTGCA GATTTGTGGT GGGAACCAAG TCCTCCCATA AAACAGGCTC 360

CTGTGGTACN AACAGGGCTG GANCCACNGA ATCAGTGCAG NTCTGGACAC CTGTCTGGCC 420

GGANGGNCTG GNCTAAGTNA ANNCAGGGGG GGCAAGAGCA TNGGANCNAA CGNCAGAAAN 480

CGNCCCNCCC GGTGAGCTNT TCCATGCCTN NCCTCGNTTT ATTTGGCACT GGGCATGTCC 540

CAACTNAACT TAGGATG 557



(2) INFORMATION FOR SEQ ID NO: 68: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 302 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GCCTATAAGT TTTGATTCCA TTCGTGAAAA TTTTTCCTAT ATCCCGAANA GTCCACTTAT 60

TACTACTGCG GCCTATTTGG AAACTAACCG AAATTCAGTT AGTTCCCTAG TAGCCTGCTC 120

TTGTAATATG TGTACTTTTC AATATTATAA AAAATTGGTC AGCAGATCTG AGTAAAACAG 180

GTGAAATTCC GATCGGTAGT CCAATTTGGT TAAAGAACAG GATATCCAGT GGTCCAAGGC 240

TCCAGTTTTG AACTCAAACA ATTATCAACC AGCTGNAAGC CCTATAGNAG TACGNAGCCC 300

AT 302

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 820 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GACTGCCTTT TTTTTCTTCC CAAGGATACC CTGCAGCACC CAACAGTAAA AGACTTCATA 60

AATAGGCAGC TTGGAGAAGA AGGCATTACC ACTGAAGCCA TATTAAATTT CTTCCCTAAC 120

GGTCCCCGAG AGAACCAAGC TGATGACATG ACCAGCTTTG ACTGGAGGGA TATATTCAAC 180

ATCACTGACC GCTTCTGCGC CTGGCTAATC AATACCTGGA GGTAAGAGGC AGCAATCCAC 240

CCGAGGACCA TAGTGAACCT CTTAATGTCA TGGGTGAGGC TAGAGACCTG TTAGCCAGTC 300

AGCTGGCACT GGATTCAGTC TTTCATCCTT CGCACAAAGT GGTAAGGGTG CCATGGCCAT 360

CTGACAGACT TGCGTGCGAC TGTCCTCACA TCTCGATAAC TTCATGACTC CTCTGGCTCC 420

CCCTCTTTCC CTTCCAGCAC ACATCCATTC CCAGCTATCT CCGGGCTGCC ATTGTCTAAT 480

GACTTCTGTT GGCCGGTGTC CGCCAAACCT TTGAGTTGAG CTCATTGATT GTGGACACTT 540

TACTCAAAGT TTAACAGCAT GTGAAAGACC CCGCTGACGG GTAGNAATCA CTCAGAGGAN 600

CCTCCAAGGA ACAGCGGGCC ACAAGNGGTN AACTNAANAG GGTTATTGNT AACGGGNNCC 660

GGGANCNAGT AATCGGGNCT GGCCCCAANT AAGGGTTTGG GCTTTATTNN CNGGGACAAA 720

AACCGCAAAA AAANNAAACG CCTTNTTGTA TTAAAANGCA NGNTTTTAGC CTTGGCCTGA 780

AATGGNGNTA AGNTACGGCC CNCNGTCAAT TCCTACTATA 820



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 955 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

AANCCGANAN TTTNAAAAAA CAANNANAAN GGGCCANGAN NTNAATANTT TCTNAAAAAA 60

NGANTACANG NACACGGCAG GGNNGTTTAG TCAGAATANA ATNNAGNGNN AACCATTGNC 120

TTTTGAGCAG GGTTTATNGG NCTACGTTGA CCCAAGTCAC ANTGNTANCA GAGATNANNG 180

AGGGGGNGGG AAGGGGTTNG GNTTTCCACA GCNTTNAAGT CAGAANTNGG AGAGACATTT 240

NGCCNTGATT CANGNCTTTN CCTCCTTATT TCCNANCNTC NCATTAANAN NAGAAAAGAG 300

TNTTTTNTTG TNTTGNGNAC AGGTGCACAA GTTTAGNANA GAGGAGACAN TGTNTAGAGA 360

TCAGATACGG ATGAGAGTTT CCGGGGANAG TATGNGGGGA TTTTCAGTCA GNNCACTACC 420

CAGAANGGAT TCAGTCGNGA GGAGNCAGGG ANGGGGTGNT GGAGTTNAGA CCGANAGAGC 480

GGNTAGCATN TAATGNNNAG AGAACACACA TNTTTTGGAT TTNAGAGACG NCCAAANCGC 540

TATACANGAT NTNTCGNTAN AGGGTGAAGA GTGAAGAAAG TGATGTCTCC ANCGCANACN 600

GGAACANGCN GCGANTTTCT TAGAGACCNA GGTTTTGATA NAGGGAAAGT CTATTCAAGC 660

CTCCCGTANA CTTGTAGGNC AAGNAAATAN TGCNNATTAT GAGNCCGTTG TTNTCAAACC 720

ANGTCCCCTA TAGCAGCAAA NAGTTGNCAG AAANTCNCAC AGAGNTCCCC CGTGAGATNG 780

NNNTTATNGN GGACACGATG TCATCAAGAG GGAGTNNTGN ACTGTGACTC CAGTCCTGTT 840

GAAGNGCATA GTAGACCATT CGCCGTGTTC ACCNACANTC AGCCNCTACC AGCNGAAAGA 900

GNAAAGGAGA GAGTTCGCAT ATGANAGACC CCACGGGTAG TTTGCAAGTA ATGAG 955
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 886 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

NTNGAAGNAN AAATTNGNAA AAANNCCNAA AACCTCCAAA TTTGCTACCA NTCTTCNACG 60

GTNGACTTTT AAACAAAAGG AGGGGGGGGT TCTTNTTCAA ATGGGCCCCT TCCCAATCCT 120

GTTCCCNAGG CAATTGTTTC TTNTTTCANC NTTCAACGGT TTTTGGGTTC CATCCAACTT 180

TTATTTNACC CNTTGAGTTT CCTGGCCGGN GCCTAGGGAC CTCCTTTTTA CNTGGGCCAG 240

TTCCCGTTCA AGACNACCCG GCGGTTAGTG GNCATGGGGA GATGGCCCCA TGANTCCAAG 300

ACAACTGTAT TCCCGGTTTT TTAGTATTTC CAAGCTTCCC GCCAATTTTT CTTCCTTCCG 360

CTTCCAGACA GTTTTGCCAG TNACGTGATT CGGTTCCGAG GCCCCAGCAC CATGGAGANT 420

GCGCGCTGTA NTCTTAGAAG GGCATTCTTC CGCCCCACNT CCCGGTNTAG CCNGAAGGCC 480

CACGGAGCAA CGAGGAGAGC GACGNTNTCT CCACAGCCGT GGCTTTTTTA TGGTTGGCAC 540

TTAAGGNTTC GCCGCCATTT TGTCCGTTCN TNGAGTTATT GTGTTGAGGG CAAGATCTTA 600

CGATTGGGTT TTGAAGGCAT GGGTAGTGGC TTGTAGACGC ATGGCAGGAG TTGGGATTCG 660

TTTGGGGACA CTGAGGGGAA GCCGNTTCTT GGGGTGTGTC CCCTNGACGC TGTTGTGGGT 720

GGGGACCGGA ACTAGACGTG CCGGGCTGCG GCGCCCAGCG TGGGAGGACT CGCGCGGGCT 780

GGCAGCCGGG CTGGGTGTCC CGGCGCCTCA CTCACATTTT TTGCCACGAT TGTCGCCTGG 840

TTTGATTTCC CACCAATCCC CCAGACCGTG CACGAGGAGT AGAAGC 886



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

GGGNGTTNGC TCTCAGATGC NAGNTACNNN TCAGGGGGNG TCTCACGAGA AAANCTNATG 60

TGTGGGGGNT ANTNTGTATC CCCTNNNCTC NCTCGAGANC CCNNNTCTCG ANATTTTGGN 120

GACCNGGGGC CGGGGCCCAG ANACTCNCCA CCCCATATGG NGACCCTNTA TAAGTGTCNN 180

CCAGGGNNTG TTTTGGGNAA AATATANCNN ANAGNGGTGT NTNTNANATC TCGGGGGGTG 240

ACAGACCCNN ATTTTTTTTT ATAAAGACCC GGGGCATNTT CTCNGCCCCN TCTCCTCNGC 300

TACANGNNAC CCACACACAG TGTGTCTCCT CTCAGCCCCC TGGCACACTT TNTNTNGANT 360

CNGNGGGGAT ATGAGATTCN CNAGACTGGG NCCGCNNTAN TANNCNCCCC CNTGTCTCCT 420

CTCATAGTGT NGTGTCCCCC CCTCACCCNN TNTTGNGGTN CCCTACACCC ACACAATNTA 480

GACTCTNCCC NCCNTCNGCT NTGNGACNCA CANCTGNAAA TCCCGNNNCN CAAAAAGGGC 540

TGTNCTCCTC TCTNTTACNG GGNGGTCNCC CNCNNNNGAC TCTNAAANGT CCCTCNCAAA 600

AGGGACNCTT TTCTATACAC NCTTANTTTN CCTCCTTTGT NTNGCAAAAA ANNANCCTGT 660

GTTNCCCCCC NCTTTATNAT NTTTNTTTTN 720

TCCGGGGCCC CAACCCCAAA ATCCCANTNT TCTTTTNTNT TGGTTGGGGT GTCAAAATTC 780

CTNCCCCTAA ANTTTTGAAC CCCCTTTAAT TCCCCCCCCC GGNTNAAGGC CCNACTTCCC 840

TNGGNTNTTT TCNCTAAAAA ATTTTTTGTN GCCCTCCCTG VJOMMn 1 V— w v_ OUJ J. J-l X X X w 900


(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1033 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

CCTACGTTCA CCTATGCGTA ACAGATCTGC TGTGTCAGGA GCCTCCTACC CTCGCGCATC 60

CTGACCCCCA ACCACGTCCT CTTATCTGAT GACTGGTCAT CTTCCCAAGT CATACACCTC 120

ACCAGATCAC TCGTGGGGAT CTCTAGGCCA CCTCCTGTGG TACCCTAGGC CTTGGATCAC 180

TACTAACTCC TGCATCGTGG TAACCTCAAT GGCTGATCTT GAGGATGCAG TCTGGAGTTC 240

GACTCCATCA GGAAGCCACA TGGGGAGGTG GCTGAATGCC ACAGGCACCT ACCACATAAT 300

GCTTCATGTC CCCACAATAG TGTCATCAAG CANCGNTATC TCCCTTTGTA CCTGNCTATC 360

ACAGTAGGCC CTATGTGTTG AAGACAGAAA CGTTCTNATA CTCAAAATAG CTACCTACTT 420

TCATCTTTAG NAAAGTTATC ACCAGAGATT TCATCACATG NCTNGGCTTA NGTATTTTAT 480

CCCCTTTCTG AACTATTTAT CACGGGCAGA AAATNTACTG ATTATCCCTG TATCATGACA 540

TCGTGCTGNA GAGAAGACCC GAGTGGGCAG CATGGNGATC CAAGGAGACA AGGGAAACCA 600

AGCAGCTATA CATAGGATGT CAGCAGCAAG CCCTTCCCTG CCCACGTCAG ACTAAACCCT 660

TCAGTCCCTT CATCTTTTCC TAGAAGGGTT TGTAATTTCT GTTGATTGTG CACCAGCGCT 720

TCCCAATCGC TGAACATCTT TCTTCGAATG TGACTCAAAG TGAGTGCACC GAGTCTGGCT 780

AATGTCCTCT GCTCCTCTTA ACCTCTGTGG CACACTCCTC CTAACACATG TGTGTCGTCT 840

TGTTCCACAG TGGCCCCACG GTACTGGTTT CAATATAGCT TATGTATGAG CAATAAGGGC 900

TATGTATTTT TTTTTTTCAG ACACTGTTCC TTTTGTATTC AACAACCTCC TCACATACTC 960

AGCCGNACCA CATTTCTTCC AGGTCAAAAA CCATCTCTCC AATTTGTTAT GAATTACTCC 1020

TNCAAGTTCA GGT 1033

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 883 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

GGGGGGNNAA NAATTTCCCA AAAANNGNNG GNCCCNTTTT TTATCCAGTT TNNGGTTGAA 60

NATCTCNCCC CGGTTTNAAA ACCCNCAATG GGGAAAAAGG TACANCNGAT TNTTTATNGG 120

TTTGGGCGGA GGGGGAAATT TTTTTGGTTT TTTTNTTTNN GGGATTTTTG AAAAAAAAAN 180

GAANTTTTTA GGTTTCCCNN ANGTAATTTA TTTCAATGGA CCATTTTTGG GGTTCTCCCT 240

TTTGTAANAN GTTAAAAANA AGGGANTTCC AANNTTNCTT TTCAGTTTCC AGTTTCACCT 300

TCNGTAGCAG ACCCAGTTTT CATTTTGAGN TGGTNCCNAA AAGGNTTCCC AACTATGTTC 360

AATACCACAG GCAGCCTGCA GGAGGGAGAA TGGGTATGTA TTTAACAGCA TTTGACCAAA 420

TTATAAGAGC AGAGAGGAGC TTTACCAGGG ACAGGAAGGC AAAAGAGCTG AATNTTAAAC 480

AAAAGAATAA GAACAGGATN TCATCTGTGA GCTGTCACAG TGGGTTT73A GAGCAGGAGA 540

ACACAGACAG GATTAGCTAT AAAGTTGTTA CATTAGTTAT TNTATTGGAG CATACAATAC 600

TTAAATAGTT CTAGGGCAAG AGAAATGAAC AGAAATGACC TTATAAGAGC CAGAGCTGTA 660

GCCACAGCTT TCTTTGTGCT TAGTTTGNTA GTTCANTCTT TCCAGGGCAG TCTGGTGGAT 720

NACACCAAAT TGCTTTAGAA AATGCTAGNT CTACTGTCCC TGTCTATTGT CAGCTTTGCA 780

ATGTGCATAG TGACAGGAGT TGCCTGGGAG CTTGGGGCTT ATGTTTTGCA GATCCATTGT 840

AATTAAAAAA GAATTGTAAG GAGATGGAGG CACGGGGTGA GGG 883
(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 892 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GGGCCCCCCT CGAGGTCGAC GGTATCGATA AGCTTGATAT CGAATTCAGC TCTTAGCAAT 60

CTGACACCCT CTTCTGGCCT CTTCAGGCAC CTGCATGGTT CCACAGGACT GTCACACCCA 120

CGTACATAGA TAGTCAAAAT CTAGAGCACT GTTTCTATAC CTGTGAGTTG CAACCCCTTT 180

GGGAGTGCGG TCAAATGACC CTATCACAGG GGTCTCAAAT GAGATATCCT GCATATCAAA 240

TATTTACATT ATGATTCATA GTAGTACCAG AATTACAGTT ATGAAGTTAC AAAATAATTT 300

TATAGCTGAG AGTCACCACA ACATGCATAA CTGTATTAAA ATGTTACAGC ATTAGCAAGG 360

TTGAGAAATA CTGGTCTAGA GCCATTCCTT GTGCTGATAA AGGTGGCAGT GAGCATTATC 420

TTTCTGTCTC CACACCACTA GCAAATTTTT TCTCTATATA TAAACATGTA ATATGAGACA 480

GTCTGAATCC ACTGAGGCAC GGTCTGACTC CAGAACAAAG GATCGTATTC CTGAAAAGCA 540

AAACGTGTGT TTGGCACTGA CTGTGTGNCC CAGGTTNTCT TTCTGNACTC CTAGAGGTCT 600

GTANTGGGTC TTGAAGCACA GATNCTCTAA CCTTACCCTG GNNGCTCAGT AGNATGCCCC 660

AAAACNCANG NTGTTCAACA TNGGGNNCCN CCCNGAAACA GNGNTGTNGG ATTTGGNAGA 720

AAGGTGNAAT NCTTTGGGCN NNTCGGTTTA GGAATTTTAA ACANNAACTG GCTTNCNAGG 780

TCCNTTCCGG AGTCATCCTT NCACTGGNGC CCNCTGGACC CGGNGNANNG GGCCANTTCG 840 

CCAGTTCGTN CCCCTGGNAC CCNTCNCCGG GGGCNAAANG CCCCTNNNNT TC 892 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 884 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: 

TGGGCCCCCC TCGAGGTCGA CGGTATCGAT AAGCTTGAGG GACCCACGTG ATGGAAAGGG 60

AGAAGCAATT TAGTGTCCTT TGTCCTCTGA CCTCCACAAG TGCTGTGGCA TGGGGACACA 120

GGACTGTACA CACACACACA CACACACACA CACACACACA CACACACGCA CGCACACACA 180

CCCCTCAAGT AACCGTGGAA TAAAGGTCCG ACCAGAAACC ACGCTGGAAC GGGAGATGCT 240

GGAGCACATC AGGGTGGTGC TAAGCAGCAG ATCGGCCTGT AACTGGCAGC AGAGGGGTGT 300

GGCTCTTTCA GAACCAGGAG GGCATCGCCC CTCCAGCCAG ACTCTCCAGC TTTCTTCCCC 360

TCCTTGCCTC CTGTTTTCCT TCTGCCTACC TTCCTTTGGC CTCAAACCAT AATGTGCAAC 420

ACATTCAAAC TGTAGTAAGT GTTTTAATTT TCTACTAAAC AATAAAACCT TTAGATTTTC 480

ACTGGGCCAG TGCTGGTAAC AGCAGACTGG GTGGAGTATC ACAGAGGGTG TGGAGCAAGC 540

TGGCTACCCA GGGCTGGGCA CACTCAACAC TCTGGCATTC TGTGGAAGTT CTGGGCAGTA 600

AAAACAGAAG CATACGTCAC GCACAGGTTC CATAGTGTTA GGCATCTTAA TCTATCTAGA 660

ATACCTGGTG TTTAGTTTGT TTACAAAATT GATTGTTGTA CTTGGACAGT GGTGTTTTTT 720

TCCCAGGGCT TCCAGGATTT AGGGGTATAC CAGGCCCATT ACATTGGGTA AACGTGTGTG 780

TTAATTTTTT CTTTTTAAAC CTCCTTGGTT GACTACTTGT TTTCCTTTTT AATGGTCCCA 840

GTTCCCCTTG GGGGGTTTGT TTTGGAAAAA GGCTTTCCGG TTTC 884
(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 326 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

AGCACACCAC AGAGAGGGGG TCTCCGTCCC CGAGAGGCAA AAGTCTCCCA CTGTGCTCCT 60

CTCCCCCCCT GGTGGGGGTT AAGAGATGGG GGCTCTGGGG GGTGATAGAA CCCCTGGCGG 120

GACACCCCCC CGCTCTCGTG GAGAGAGACA GAGGGGGGTG CCCCTGATAT CTCACTAGAG 180

GGGAGAGGTG AGAGGGCTCC ACAGTGTGGT GTGGTGGTGA GTGCTCTATC TCCAGGTGTC 240

TCACATATTT TCACAGCTCT TGACCACAGA GAGATCTTGT TGACTCTGTG CTCGCGGAAT 300

CTAATGTGCC CCACATCATA TACACA 326
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 557 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

GGGGGGGTCT CACNNTANAN CACTCNGGNG TCTCCCATGT CTAGATCTCC CCCCNGCNCN 60

NGNGANGAGT GTGNGGAGAT CCCTCTCTGN TCTCTACACT CTAAAGGGTA NGCGGGGAGA 120

GAGAGAGAGC ACANTCTATA GANCACANAG CACACNCGCT CNANGTGCCC NANTNACANG 180

NNAGAGAGAN CCCCTCTCNC AGTATATNGG GGAGAGAGTN TGAGGGACNC TCCTCTTTTC 240

TCTCAACNCT GNGGGGGGAG NGNGAGTGTT CTCTCTGNGG GGNGGAGNGG NACACTCNGN 300

TCTNCGTNTG NGTGCNCNNG TNTTCTGGGG GTCACANAGA AATCNCCTNT CTCAACACAA 360

CAACAACAAC CCCCCGCACG NGCACACACC ACAACAACAA NGGGACANCG CGNGGGGGNT 420

NGNGCACACC CAGNGGAGAC ACTGTTTTCT GTTTNACACA CACACACACA CACACACACA 480

CNCNCCCCCC ACANAGTTTT TNGGAAAANC GCNGGGGGGG GNGGGNCTTT TTGCCNCAAG 540

CCTTTTTTNA NCNCCCA 557



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

GTCTCCCCCA AAGGGGGGGT CTCACCCTCC CGGACACCAC ACATCTGTCT GTCTCTCTGA 60 

TCTCTGACAC CCCACAGAGA TATATATAGG GACAACGCCG CTGTCCCCAT GATATAGAGA 120 

GAAGCGAGAC AAACTCTCAG GTACACATGA CACATGATCC CCATGATCCC CGGCACACTC 180 

TTCTAATATA GTTGAGAGAG TTGTGTCTCT CAAGTGTCTC TGGTATTTTC TAACCCCATG 240

TTTTCTCTCA CAATGTCACA CGGGGGAGCT CGGACGCGGT GCACATGGGG GAGAGTTCGT 300

GTCTATGACA CACTAGTCTT GCCCCCGAAC CACAGAGACC TCGACTCGGG TTTAGTCTCC 360

TCTGCCCCCC CAGCTC 376
(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 533 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

ATNNCCCAAN ATCANATGNG GAANNNCCCA CATTTTNTAT NTAGAAANGN GTTTTGTGTG 60

TGTGNGTNNA ATTTGAGNTT TCACAGAGNT NACATTCTCT GTGTCACAAN CCCTTTCTCT 120

CTACACTCCA CAGTGTGGTG NGAGATATAC TNTGANACAN ATGNGCTCTC TCCTCNCCCC 180

CCNNCATGTT NTNCCCCACA GTNTACNNCN NCNATATATN GNNCNCNGNA GANNGGTATG 240

NGNGNTGTNT TTNTTTAAAA AGATNTNANA NAGNGGGTAT GCGTGNGGGG TATGTNNANA 300

CATATATGTN NNAGAGGGTC TCTCTGNGGC CCNATGGAGG CANATCCCCC CCNCTCNGAG 360

NNATATAGAA AAGAGTNTTT NANGGTGTTT GTGGACACAG ATAAGGGGAG AGAGAGAGAG 420

AGAGANAGAG AGAGANAGAG AGAGAGAGAG AGAGAGANAN GGNGTNTTNG GNTTCNTCCC 480

CCCCNATATA CAGAAAAANC GGGGGGGGGT TAGGNGGNNG GGGGTTTNCT TTA 533
(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 346 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

TTTCACACGA GATGTCGCGA CTCTCGCGAG ACTCTCAGCG CGGAGATATA GACCCACAAG 60

GGGAATCCCC CGGGTTTTTT GCCACAGGAG AGCGCGAGGA GAGAGATATT CTTATTATGG 120

CTATAGACAC CCCCGTGGGT GGGGGACATT TGTGGTGTTT CCACAGGGGG GGGGATGTAC 180

CCCGGATATC AGAGTATTCT CTAAAAAAGG TGAGAAGAGG TCTTCTCTTT TGAGAGTATG 240

GGGACACTCG AGGAGAGCTC TCTATCTATC TCTCACAGCG CCCCTGTGTG GGCGGATCCT 300

CCACACCAGA TGTTAGTGTG NAGATCTCCC CATCTTCTAT ATTGAA 346
(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 461 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

GAANACCCAA AATTGNGCTN GTGGGCAAAN NTTTTNCCGT TTCTTGTGCT TGNGCGGCNA 60 

AGNNAAAAAT TCAAAACCAA NACCACANAA GCGCGTTATC CTGNCTNTCT GCCNTTNCCC 120

TGTCACACTG NGGCTGTACA GACATCNANC GCTTTCTAGA GAGACGNGAG AGTCAGGGGA 180

CTCTTTCCCC CANNCGCATT ATANCCACAT ATTAGNGTAN NANATTCAGC TGTGNTNCAC 240

TGGGNGTGTC TCCNTAGTGT GAAGCAACAC AGGGAAACTN TTCGCNCACA TGTCCTCTGG 300

TGTTCACAGA NATAAGNAGG CTCCTAGACC NNTATNACTG TGGGNAGAGN ATGTTACCTC 360

CCTATANNTC GGGGTCTATC TCTGTGAGAN AGAGNTTCCT TTCTCCCATN CCTACCTCAG 420

TGGGGTGNTA TNTACATCNC AGAGAGCAGA NAACTGTGAG C 461
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 



GGGGTNTCAC AGAGANAGGG CACANCTCTC CCNAGAANGG GNCNNCCCTC TTTTTNNGGN 60

GTAACACCTC TCNCCGTGTC TCTTTCTTTC TTTTTTNTTT TTTGGGGGGC TCTTTTTCGN 120

GGAGGNGGAG NNCGNCCGAG GGTCGGGCNN NNCNGNGGAN AGCTCTNTCN CANNGATATA 180

TCNCCNNANC CCCCCTGTNT CTTATAANNN ACATCTCTTC NTCNCAGGGT CACACCNAGA 240

NTCTCNTTTC TACAACAACC CCCACACGCN AAAGCTCCCC ACNNNGNGNG GGGGTCTCNC 300

AAGAANATCT CNGCGGAGAG GTGGNGGAGA GAGTGANATC TGNATNTCTG GNTTCCCCNC 360

ANTGCCC 367
What is claimed is:

1. An isolated nucleic acid comprising a nucleotide sequence set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29,
SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44,
SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54,
SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59,
SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64,
SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69,
SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74,
SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79,
SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.

2. An allelic variant or homolog of the nucleic acid of claim 1.



3. An isolated nucleic acid encoding the protein encoded by the gene comprising
the nucleotide sequence set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29,
SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44,
SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54,
SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59,
SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64,
SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69,
SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74,
SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79,
SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.
SEQ ID NO:77, 



4. A host cell containing the nucleic acid of claim 1, 2 or 3. 

5. A nucleic acid that selectively hybridizes under stringent conditions with the
nucleic acid of claim 1, 2 or 3.

6. A nucleic acid having a region within an exon wherein the region has at least 50%
homology with the nucleic acid of claim 1, 2 or 3.

7. A nucleic acid having a region within an exon wherein the region has at least 60%
homology with the nucleic acid of claim 1, 2 or 3.

8. A nucleic acid having a region within an exon wherein the region has at least 70%
homology with the nucleic acid of claim 1, 2 or 3.

9. A nucleic acid having a region within an exon wherein the region has at least 80%
homology with the nucleic acid of claim 1, 2 or 3.

10. A nucleic acid having a region within an exon wherein the region has at least 90%
homology with the nucleic acid of claim 1, 2 or 3.

11. A nucleic acid having a region within an exon wherein the region has at least 95%
homology with the nucleic acid of claim 1, 2 or 3.

12. A protein encoded by the nucleic acid of claims 1, 2, 3, 5, 6, 7, 8, 9, 10 or 11.



13. A nucleic acid comprising a regulatory region of a gene comprising the
nucleotide sequence set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29,
SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44,
SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54,
SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59,
SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64,
SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69,
SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74,
SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79,
SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.



14. A construct comprising a regulatory region of claim 13, wherein the regulatory 
region is functionally linked to a reporter gene. 

15. A method of identifying a cellular gene necessary for viral growth in a cell and
nonessential for cellular survival, comprising 

(a) transferring into a cell culture growing in serum-containing medium a vector 
encoding a selective marker gene lacking a functional promoter,

(b) selecting cells expressing the marker gene,

(c) removing serum from the culture medium, 

(d) infecting the cell culture with the virus, and 

(e) isolating from the surviving cells a cellular gene within which the marker 
gene is inserted, thereby identifying a gene necessary for viral growth in a cell and 
nonessential for cellular survival. 



16. A method of reducing or inhibiting a viral infection in a subject, comprising
administering to the subject an amount of a composition that inhibits expression or
functioning of a gene product encoded by a gene comprising the nucleic acid set forth in
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5,
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20,
SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25,
SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30,
SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35,
SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40,
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45,
SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50,
SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55,
SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60,
SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65,
SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70,
SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, or a
homolog thereof, thereby treating the viral infection.



17. The method of claim 16, wherein the composition comprises an antibody that 
binds a protein encoded by the gene.

18. The method of claim 16, wherein the composition comprises an antibody that
binds a receptor for a protein encoded by the gene.

19. The method of claim 16, wherein the composition comprises an antisense RNA
that binds an RNA encoded by the gene. 

20. The method of claim 16, wherein the composition comprises a nucleic acid 
functionally encoding an antisense RNA that binds an RNA encoded by the gene. 

21. A method of reducing or inhibiting a viral infection in a subject comprising
mutating ex vivo in a selected cell from the subject an endogenous gene comprising the
nucleic acid set forth in
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5,
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20,
SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25,
SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30,
SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35,
SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40,
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45,
SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50,
SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55,
SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60,
SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65,
SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70,
SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or
SEQ ID NO:75, or a homolog thereof, to a mutated gene incapable of producing a
functional gene product of the gene or to a mutated gene producing a reduced amount
of a functional gene product of the gene, and replacing the cell in the subject, thereby
reducing viral infection of cells in the subject.

22. The method of claim 21, wherein the cell is a hematopoietic cell.



23. A method of reducing or inhibiting a viral infection in a subject comprising 
mutating ex vivo in a selected cell from the subject an endogenous gene comprising a 
nucleic acid isolated by the method of claim 15, to a mutated gene incapable of 
producing a functional gene product of the gene or to a mutated gene producing a 
reduced amount of a functional gene product of the gene, and replacing the cell in the 
subject, thereby reducing viral infection of cells in the subject. 

24. The method of claim 23, wherein the virus is HIV. 

25. The method of claim 23, wherein the cell is a hematopoietic cell. 

26. A method of increasing viral infection resistance in a subject comprising 
mutating ex vivo in a selected cell from the subject an endogenous gene comprising a 
nucleic acid isolated by the method of claim 15, to a mutated gene incapable of 
producing a functional gene product of the gene or to a mutated gene producing a 
reduced amount of a functional gene product of the gene, and replacing the cell in the 
subject, thereby reducing viral infection of cells in the subject. 

27. The method of claim 26, wherein the virus is HIV. 

28. The method of claim 26, wherein the cell is a hematopoietic cell. 

29. A method of screening a compound for effectiveness in treating a viral infection, 
comprising administering the compound to a cell containing a cellular gene functionally 
encoding a gene product necessary for reproduction of the virus in the cell but not 
necessary for survival of the cell and detecting the level of the gene product produced, a 
decrease or elimination of the gene product indicating a compound effective for treating 
the viral infection.

30. The method of claim 29, wherein the cellular gene comprises the nucleic acid set
forth in
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5,
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20,
SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25,
SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30,
SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35,
SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40,
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45,
SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50,
SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55,
SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60,
SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65,
SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70,
SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, or a
homolog thereof.



31. The method of claim 29, wherein the cellular gene is a gene identified by the
method of claim 15. 

32. A method of screening a compound for reducing or inhibiting a viral infection, 
comprising administering the compound to a cell containing the construct of claim 14 
and detecting the level of the reporter gene product produced, a decrease or elimination 
of the reporter gene product indicating a compound for reducing or inhibiting the viral 
infection. 



33. A purified mammalian serum protein having a molecular weight of between 
about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by 
chloroform extraction, which inactivates when boiled and inactivates in low ionic 
strength solution, and which when removed from a cell culture comprising cells
persistently infected with reovirus selectively prevents survival of cells persistently
infected with reovirus. 



34. A method of selectively eliminating, from an animal cell culture capable of
surviving for a first period of time in the absence of serum, cells persistently infected 
with a virus, comprising propagating the cell culture in the absence of serum for a 
second time period which a persistently infected cell cannot survive without serum, 
thereby selectively eliminating from the cell culture cells persistently infected with the 
virus. 

35. The method of claim 34, wherein the second time period is from about three 
days to about ten days. 

36. The method of claim 34, further comprising transferring the cell culture from a 
first container to a second container. 

37. A method of selectively eliminating from a cell culture cells persistently infected 
with a virus, comprising propagating the cell culture in the absence of a functional form 
of the protein of claim 33. 

38. A method of reducing or inhibiting a viral infection in a subject, comprising 
administering to the subject an amount of a composition that inhibits functioning of a 
serum protein having a molecular weight of between about 50 kD and 100 kD which 
resists inactivation in low pH and resists inactivation by chloroform extraction, which 
inactivates when boiled and inactivates in low ionic strength solution, and which, when 
removed from a cell culture comprising cells persistently infected with the virus, 
prevents survival of cells persistently infected with the virus, thereby reducing or 
inhibiting the viral infection. 

39. The method of claim 38, wherein the composition comprises an antibody that 
binds the serum protein.

40. The method of claim 38, wherein the composition comprises an antisense RNA
that binds an RNA encoded by the gene. 

41. A method of identifying a cellular gene that can suppress a malignant phenotype 
in a cell, comprising 

(a) transferring into a cell culture incapable of growing well in soft agar a vector 
encoding a selective marker gene lacking a functional promoter, 

(b) selecting cells expressing the marker gene, and 

(c) isolating from selected cells which are capable of growing in agar a cellular 
gene within which the marker gene is inserted, thereby identifying a gene that can 
suppress a malignant phenotype in a cell. 

42. A method of identifying a cellular gene that can suppress a malignant phenotype
in a cell, comprising

(a) transferring into a cell culture of non-transformed cells a vector encoding a
selective marker gene lacking a functional promoter,

(b) selecting cells expressing the marker gene, and 

(c) isolating from selected and transformed cells a cellular gene within which the 
marker gene is inserted, thereby identifying a gene that can suppress a malignant 
phenotype in a cell. 

43. A method of screening for a compound for suppressing a malignant phenotype in
a cell comprising administering the compound to a cell containing a cellular gene 
functionally encoding a gene product involved in establishment of a malignant phenotype 
in the cell and detecting the level of the gene product produced, a decrease or 
elimination of the gene product indicating a compound effective for suppressing the 
malignant phenotype. 

44. A method of suppressing a malignant phenotype in a cell in a subject, comprising 
administering to the subject an amount of a composition that inhibits expression or 
functioning of a gene product encoded by a gene comprising the nucleic acid set forth in
SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80,
SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83, or a homolog thereof, thereby
suppressing a malignant phenotype. 

45. The method of claim 44, wherein the composition comprises an antibody that 
binds a protein encoded by the gene. 

46. The method of claim 44, wherein the composition comprises an antibody that 
binds a receptor for a protein encoded by the gene. 

47. The method of claim 44, wherein the composition comprises an antisense RNA 
that binds an RNA encoded by the gene. 

48. The method of claim 44, wherein the composition comprises a nucleic acid 
functionally encoding an antisense RNA that binds an RNA encoded by the gene. 